CloudNeo: a cloud pipeline for identifying patient-specific tumor neoantigens.

Preeti Bais, Sandeep Namburi, Daniel M Gatti, Xinyu Zhang, Jeffrey H Chuang.

Abstract

SUMMARY: We present CloudNeo, a cloud-based computational workflow for identifying patient-specific tumor neoantigens from next-generation sequencing data. Tumor-specific mutant peptides can be detected by the immune system through their interactions with the human leukocyte antigen complex, and neoantigen presence has recently been shown to correlate with anti-tumor T-cell immunity and the efficacy of checkpoint inhibitor therapy. However, the computing capabilities needed to identify neoantigens from genomic sequencing data are a limiting factor for understanding their role. This challenge has grown as cancer datasets have become increasingly abundant, making them cumbersome to store and analyze on local servers. Our cloud-based pipeline provides scalable computation capabilities for neoantigen identification while eliminating the need to invest in local infrastructure for data transfer, storage or compute. The pipeline is a Common Workflow Language (CWL) implementation of human leukocyte antigen (HLA) typing using Polysolver or HLAminer, combined with custom scripts for mutant peptide identification and NetMHCpan for neoantigen prediction. We have demonstrated the efficacy of these pipelines on Amazon cloud instances through the Seven Bridges Genomics implementation of the NCI Cancer Genomics Cloud, which provides graphical interfaces for running and editing, infrastructure for workflow sharing and version tracking, and access to TCGA data.
AVAILABILITY AND IMPLEMENTATION: The CWL implementation is at: https://github.com/TheJacksonLaboratory/CloudNeo. For users who have obtained licenses for all internal software, integrated versions in CWL and on the Seven Bridges Cancer Genomics Cloud platform (https://cgc.sbgenomics.com/, recommended version) can be obtained by contacting the authors.
CONTACT: jeff.chuang@jax.org.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2017. Published by Oxford University Press.

Year:  2017        PMID: 28605406      PMCID: PMC5870764          DOI: 10.1093/bioinformatics/btx375

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Mutations in tumor genomes create specific peptide changes that can be recognized by the immune system and influence sensitivity to immunotherapy (van der Most; van Rooij). The mechanism involves binding of native major histocompatibility complex (MHC) class I and II molecules, also known as human leukocyte antigen (HLA) class I and II molecules, to the novel peptide sequences that result from protein-changing somatic mutations in cancer cells. Cells presenting these neoantigens are recognized as foreign by T-cells, which then selectively destroy them. With the arrival of next-generation sequencing platforms, it has become possible to interrogate the genomes of patient tumors and computationally predict T-cell reactivity against putative mutation-derived neoantigens (Schumacher) by estimating the binding of MHC class I molecules to each new peptide sequence. Several bioinformatics tools are routinely used to predict tumor neoantigen-MHC class I binding from sequencing data. For example, HLAminer (Warren) and Polysolver (Shukla) are software tools that can predict patient-specific HLA class I and II types from RNA sequencing data, and NetMHCpan (Nielsen) predicts HLA-peptide binding. Prior studies in cancer immunotherapy have successfully used these tools to predict the efficacy of immuno-oncological therapies in a patient-specific manner (Rizvi; Van Allen), demonstrating the importance of making such methods easily available to the general research community. However, the cost of developing and maintaining the bioinformatics infrastructure to perform this type of analysis is substantial. In particular, research groups are generating increasing amounts of custom sequencing data or investigating massive consortium datasets such as The Cancer Genome Atlas (Weinstein), for which data transfer and scalability of computing can be significant obstacles to analysis on local compute clusters.
To resolve these problems, we have developed a cloud-based analysis pipeline for tumor neoantigen detection.

2 Description

We developed the CloudNeo pipeline on the Seven Bridges cloud platform as part of the National Cancer Institute’s Cancer Genomics Cloud (CGC; http://www.cancergenomicscloud.org/), which uses Docker containers to execute the tasks in the workflow. Briefly, CloudNeo takes a VCF file (for mutations) and a BAM file (for HLA typing) as inputs and outputs HLA binding affinity predictions for all mutated peptides (see Supplementary Fig. S1). The first input to CloudNeo is a list of non-synonymous mutations in VCF format. Multiple somatic mutation calling pipelines can be used to generate and filter this VCF file (Alioto), including several that are available through the CGC. The genomic variants are translated into amino acid changes using the VEP tool (McLaren) and a custom R script that we have created, called Protein_Translator. The output of this custom tool is a list of N-amino-acid-long peptide sequences in FASTA format, such that the single amino acid change is in the middle of the N-mer. In parallel, Protein_Translator generates a second FASTA file containing the homologous N-mers without the mutation. Users can choose to calculate the HLA types using either HLAminer or Polysolver. Six HLA types are predicted, namely the top two predictions for each of HLA-A, HLA-B and HLA-C. The final step in the pipeline is the NetMHCpan tool, which uses the HLA types and the N-mer mutant peptide sequences to calculate the binding affinities for potential neoantigens. Affinities are computed between the two HLA-A, two HLA-B and two HLA-C molecules and each of the ([N/2]+1)-mer peptide subsequences within the N-mers. The output of the pipeline is a list of peptide subsequences along with the MHC binding affinity scores for each of the six HLA types. Corresponding results are generated for the homologous unmutated peptide sequences as a comparison.
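The peptide windowing described above can be illustrated with a minimal Python sketch. This is not the authors' Protein_Translator R script. The function names, the 0-based `pos` argument, the default N = 17, the reading of [N/2] as integer (floor) division, and the choice to reject mutations too close to a protein end are all assumptions made for illustration.

```python
def centered_nmer(protein: str, pos: int, alt: str, n: int = 17) -> tuple[str, str]:
    """Return (mutant, wild-type) N-mers with the substituted residue in the middle.

    `pos` is the 0-based index of the mutated residue in `protein` (assumption).
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so the mutation can sit in the middle")
    flank = n // 2
    start, end = pos - flank, pos + flank + 1
    if start < 0 or end > len(protein):
        # Assumption: this sketch rejects these cases instead of truncating.
        raise ValueError("mutation too close to a protein end for a full N-mer")
    wild = protein[start:end]
    mutant = wild[:flank] + alt + wild[flank + 1:]
    return mutant, wild


def subpeptides(nmer: str) -> list[str]:
    """Return every (N//2 + 1)-mer window; each window spans the middle residue."""
    k = len(nmer) // 2 + 1
    return [nmer[i:i + k] for i in range(len(nmer) - k + 1)]


# Hypothetical 25-residue protein with a P -> W substitution at index 12.
protein = "ACDEFGHIKLMNPQRSTVWYACDEF"
mutant, wild = centered_nmer(protein, pos=12, alt="W")
mutant_windows = subpeptides(mutant)
wild_windows = subpeptides(wild)
```

Under these assumptions a 17-mer yields nine 9-mer windows, each containing the mutated residue, and the wild-type windows pair with the mutant windows one to one for the comparison described in the text.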
To test this pipeline, we analyzed 23 melanoma tumor samples (Hugo) as described above, using both the HLAminer and Polysolver versions of the pipeline. We then predicted neoantigens based on three criteria: strong mutant-MHC binding affinity (NetMHCpan score < 500), non-zero expression of the transcript containing the mutation, and lack of strong affinity between the non-mutated sequence and the MHC (NetMHCpan score for the non-mutant sequence ≥ 500). For each sample we merged the set of neoepitopes predicted across the six HLA types. The neoepitope load ranged from 0 to 1244 with an average of 107.89 using the HLAminer version of the pipeline. For the Polysolver version of the pipeline, the same filtering criteria were used and the neoepitope load ranged from 0 to 1417 with an average of 133.53. The differences between the two sets of results were due to differing HLA type predictions by Polysolver and HLAminer. Sixteen HLA type predictions were shared between the tools, and there were 102 unique HLA predictions from Polysolver and 122 unique predictions from HLAminer. While our HLA type predictions were based on RNA-seq data, CloudNeo can also use DNA data as inputs for HLA calling. The average wall time required to run the pipeline for a given tumor on the CGC was 8 h and 2 min for the HLAminer version and 7 h and 25 min for the Polysolver version (see Supplementary Material ‘Pipeline Performance’).
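The three filtering criteria and the per-sample merge can be sketched as follows. This is a hedged illustration and not the pipeline's own code. The `Prediction` record, its field names, the example peptides and HLA labels are hypothetical, and counting unique peptides (not peptide-HLA pairs) as the neoepitope load is an assumption about how the merge across HLA types was performed.

```python
from dataclasses import dataclass

BINDING_THRESHOLD = 500.0  # NetMHCpan score cutoff quoted in the text


@dataclass(frozen=True)
class Prediction:
    peptide: str           # mutant peptide subsequence
    hla: str               # one of the six predicted HLA types
    mutant_score: float    # NetMHCpan score for the mutant peptide
    wildtype_score: float  # NetMHCpan score for the unmutated counterpart
    expression: float      # expression of the transcript carrying the mutation


def is_neoepitope(p: Prediction, threshold: float = BINDING_THRESHOLD) -> bool:
    """Apply the three criteria: strong mutant binding, expressed, weak wild-type binding."""
    return (
        p.mutant_score < threshold
        and p.expression > 0
        and p.wildtype_score >= threshold
    )


def neoepitope_load(predictions: list[Prediction]) -> int:
    """Merge passing peptides across HLA types (assumed: unique peptides) and count them."""
    return len({p.peptide for p in predictions if is_neoepitope(p)})


# Hypothetical predictions for one sample.
sample = [
    Prediction("SIINFEKLM", "HLA-A*02:01", 120.0, 2300.0, 5.2),  # passes
    Prediction("SIINFEKLM", "HLA-B*07:02", 300.0, 900.0, 5.2),   # same peptide, second HLA
    Prediction("AAAAAAAAA", "HLA-A*02:01", 50.0, 40.0, 3.0),     # wild type also binds strongly
    Prediction("CCCCCCCCC", "HLA-C*07:01", 80.0, 5000.0, 0.0),   # transcript not expressed
    Prediction("DDDDDDDDD", "HLA-B*07:02", 500.0, 5000.0, 1.0),  # mutant score not below 500
    Prediction("EEEEEEEEE", "HLA-C*07:01", 499.0, 500.0, 0.1),   # passes at both boundaries
]
load = neoepitope_load(sample)
```

Keeping the threshold as a parameter makes it straightforward to explore how sensitive the load is to the cutoff, which the text fixes at 500.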

3 Discussion

Other recent methods, such as pVAC-Seq (Hundal), are similar to CloudNeo in providing a computational pipeline for neoantigen prediction. However, to our knowledge CloudNeo is the only such pipeline that has been developed for cloud computing. This allows users to realize the advantages of cloud analysis, including massive computing scalability and access to large datasets on the CGC such as TCGA, which can be reached without downloading to a local server. The cloud approach also makes it easy to match CloudNeo runs to time and budget restrictions on demand, providing a flexible computational approach for the research community. A version of the CloudNeo pipeline is openly available at the GitHub site as a Common Workflow Language (CWL) implementation that can be run using Rabix (Kaushik), allowing it to run on systems including AWS, Google Compute Engine and Azure. Licenses for academically licensed software (HLAminer and NetMHCpan) must be obtained by users, but simple instructions to do so are provided at the GitHub site. Users with licenses can also contact the authors to request a version with all software integrated. Full versions are available either in CWL or as a workflow on the Seven Bridges implementation of the CGC. The CGC version is recommended, as it provides additional functionality including graphical interfaces for running and editing, simple workflow sharing and version tracking, improved calling of multiple cloud instances, and access to TCGA data. Full details and documentation are at https://github.com/TheJacksonLaboratory/CloudNeo.
References (10 of 14 shown)

1.  Analysis of cytotoxic T cell responses to dominant and subdominant epitopes during acute and chronic lymphocytic choriomeningitis virus infection.

Authors:  R G van der Most; A Sette; C Oseroff; J Alexander; K Murali-Krishna; L L Lau; S Southwood; J Sidney; R W Chesnut; M Matloubian; R Ahmed
Journal:  J Immunol       Date:  1996-12-15       Impact factor: 5.422

2.  Rabix: an open-source workflow executor supporting recomputability and interoperability of workflow descriptions.

Authors:  Gaurav Kaushik; Sinisa Ivkovic; Janko Simonovic; Nebojsa Tijanic; Brandi Davis-Dusenbery; Deniz Kural
Journal:  Pac Symp Biocomput       Date:  2017

3.  Genomic correlates of response to CTLA-4 blockade in metastatic melanoma.

Authors:  Eliezer M Van Allen; Diana Miao; Bastian Schilling; Sachet A Shukla; Christian Blank; Lisa Zimmer; Antje Sucker; Uwe Hillen; Marnix H Geukes Foppen; Simone M Goldinger; Jochen Utikal; Jessica C Hassel; Benjamin Weide; Katharina C Kaehler; Carmen Loquai; Peter Mohr; Ralf Gutzmer; Reinhard Dummer; Stacey Gabriel; Catherine J Wu; Dirk Schadendorf; Levi A Garraway
Journal:  Science       Date:  2015-09-10       Impact factor: 47.728

4.  The Cancer Genome Atlas Pan-Cancer analysis project.

Authors:  John N Weinstein; Eric A Collisson; Gordon B Mills; Kenna R Mills Shaw; Brad A Ozenberger; Kyle Ellrott; Ilya Shmulevich; Chris Sander; Joshua M Stuart
Journal:  Nat Genet       Date:  2013-10       Impact factor: 38.330

5.  Cancer immunology. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer.

Authors:  Naiyer A Rizvi; Matthew D Hellmann; Alexandra Snyder; Pia Kvistborg; Vladimir Makarov; Jonathan J Havel; William Lee; Jianda Yuan; Phillip Wong; Teresa S Ho; Martin L Miller; Natasha Rekhtman; Andre L Moreira; Fawzia Ibrahim; Cameron Bruggeman; Billel Gasmi; Roberta Zappasodi; Yuka Maeda; Chris Sander; Edward B Garon; Taha Merghoub; Jedd D Wolchok; Ton N Schumacher; Timothy A Chan
Journal:  Science       Date:  2015-03-12       Impact factor: 47.728

6.  Tumor exome analysis reveals neoantigen-specific T-cell reactivity in an ipilimumab-responsive melanoma.

Authors:  Nienke van Rooij; Marit M van Buuren; Daisy Philips; Arno Velds; Mireille Toebes; Bianca Heemskerk; Laura J A van Dijk; Sam Behjati; Henk Hilkmann; Dris El Atmioui; Marja Nieuwland; Michael R Stratton; Ron M Kerkhoven; Can Kesmir; John B Haanen; Pia Kvistborg; Ton N Schumacher
Journal:  J Clin Oncol       Date:  2013-09-16       Impact factor: 44.544

7.  Comprehensive analysis of cancer-associated somatic mutations in class I HLA genes.

Authors:  Sachet A Shukla; Michael S Rooney; Mohini Rajasagi; Grace Tiao; Philip M Dixon; Michael S Lawrence; Jonathan Stevens; William J Lane; Jamie L Dellagatta; Scott Steelman; Carrie Sougnez; Kristian Cibulskis; Adam Kiezun; Nir Hacohen; Vladimir Brusic; Catherine J Wu; Gad Getz
Journal:  Nat Biotechnol       Date:  2015-11       Impact factor: 54.908

8.  Derivation of HLA types from shotgun sequence datasets.

Authors:  René L Warren; Gina Choe; Douglas J Freeman; Mauro Castellarin; Sarah Munro; Richard Moore; Robert A Holt
Journal:  Genome Med       Date:  2012-12-10       Impact factor: 11.117

9.  A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing.

Authors:  Tyler S Alioto; Ivo Buchhalter; Sophia Derdak; Barbara Hutter; Matthew D Eldridge; Eivind Hovig; Lawrence E Heisler; Timothy A Beck; Jared T Simpson; Laurie Tonon; Anne-Sophie Sertier; Ann-Marie Patch; Natalie Jäger; Philip Ginsbach; Ruben Drews; Nagarajan Paramasivam; Rolf Kabbe; Sasithorn Chotewutmontri; Nicolle Diessl; Christopher Previti; Sabine Schmidt; Benedikt Brors; Lars Feuerbach; Michael Heinold; Susanne Gröbner; Andrey Korshunov; Patrick S Tarpey; Adam P Butler; Jonathan Hinton; David Jones; Andrew Menzies; Keiran Raine; Rebecca Shepherd; Lucy Stebbings; Jon W Teague; Paolo Ribeca; Francesc Castro Giner; Sergi Beltran; Emanuele Raineri; Marc Dabad; Simon C Heath; Marta Gut; Robert E Denroche; Nicholas J Harding; Takafumi N Yamaguchi; Akihiro Fujimoto; Hidewaki Nakagawa; Víctor Quesada; Rafael Valdés-Mas; Sigve Nakken; Daniel Vodák; Lawrence Bower; Andrew G Lynch; Charlotte L Anderson; Nicola Waddell; John V Pearson; Sean M Grimmond; Myron Peto; Paul Spellman; Minghui He; Cyriac Kandoth; Semin Lee; John Zhang; Louis Létourneau; Singer Ma; Sahil Seth; David Torrents; Liu Xi; David A Wheeler; Carlos López-Otín; Elías Campo; Peter J Campbell; Paul C Boutros; Xose S Puente; Daniela S Gerhard; Stefan M Pfister; John D McPherson; Thomas J Hudson; Matthias Schlesner; Peter Lichter; Roland Eils; David T W Jones; Ivo G Gut
Journal:  Nat Commun       Date:  2015-12-09       Impact factor: 14.919

10.  pVAC-Seq: A genome-guided in silico approach to identifying tumor neoantigens.

Authors:  Jasreet Hundal; Beatriz M Carreno; Allegra A Petti; Gerald P Linette; Obi L Griffith; Elaine R Mardis; Malachi Griffith
Journal:  Genome Med       Date:  2016-01-29       Impact factor: 11.117

