NeoFuse: predicting fusion neoantigens from RNA sequencing data.

Georgios Fotakis1, Dietmar Rieder1, Marlene Haider1, Zlatko Trajanoski1, Francesca Finotello1.   

Abstract

SUMMARY: Gene fusions can generate immunogenic neoantigens that mediate anticancer immune responses. However, their computational prediction from RNA sequencing (RNA-seq) data requires deep bioinformatics expertise to assemble a computational workflow covering the prediction of fusion transcripts, their translated proteins and peptides, Human Leukocyte Antigen (HLA) types, and peptide-HLA binding affinity. Here, we present NeoFuse, a computational pipeline for the prediction of fusion neoantigens from tumor RNA-seq data. NeoFuse can be applied to cancer patients' RNA-seq data to identify fusion neoantigens that might expand the repertoire of suitable targets for immunotherapy.
AVAILABILITY AND IMPLEMENTATION: NeoFuse source code and documentation are available under the GPLv3 license at https://icbi.i-med.ac.at/NeoFuse/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2020. Published by Oxford University Press.

Year:  2020        PMID: 31755900      PMCID: PMC7141848          DOI: 10.1093/bioinformatics/btz879

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Neoantigens are tumor-specific peptides arising from the expression of mutated genes in cancer cells. Class-I neoantigens, recognized as ‘non-self’ by CD8+ T cells, can elicit strong anticancer immune responses. Besides being major determinants of response to immune checkpoint blockers, neoantigens are at the basis of other cancer immunotherapies such as personalized cancer vaccines and adoptive T cell therapy (Lee ). To date, most efforts have been directed at identifying neoantigens generated from missense somatic mutations (Finotello ). However, tumor-specific gene fusions, splicing isoforms, and expressed human endogenous retroviruses can also be a source of neoantigens (Smith ). A recent study in patients with head and neck cancer demonstrated that gene fusions generate immunogenic neoantigens that can mediate the response to immune checkpoint blockers in tumors with low mutational burden (Yang ). Computational strategies for the identification of fusion neoantigens from RNA sequencing (RNA-seq) data have been proposed recently (Rathe ; Richman ; Zhang ). However, all of them build upon pre-analysis with third-party tools to first predict fusion transcripts from tumor RNA-seq and thus require deep bioinformatic expertise to assemble a full computational workflow. Here, we present NeoFuse, a user-friendly pipeline for the prediction of fusion neoantigens from tumor RNA-seq data. NeoFuse is available as Singularity (https://sylabs.io) and Docker (https://www.docker.com) images to simplify installation and analysis.

2 The NeoFuse pipeline

NeoFuse takes single-sample FASTQ files of RNA-seq reads as input and predicts putative fusion neoantigens through five analytical modules based on state-of-the-art computational tools (Fig. 1). Both single- and paired-end data can be used, but we advise using the latter to increase sensitivity and accuracy of fusion detection. The first module performs class-I Human Leukocyte Antigen (HLA) typing at 4-digit resolution using OptiType (Szolek ), which is one of the best performing methods for this task (Finotello ). The second module predicts fusion peptides using Arriba (https://github.com/suhrig/arriba), considering both fusion junctions and 3’ out-of-frame sequences. We chose Arriba because it outperformed competitor prediction methods in the DREAM Somatic Mutation Calling–RNA Challenge (https://www.synapse.org/SMC_RNA). Moreover, it computes a confidence score reflecting the likelihood that a fusion is caused by a tumor-specific genomic rearrangement and is not due to technical artifacts. The third module uses MHCflurry (O’Donnell ) to predict binding affinity of fusion peptides to HLA types, quantified as half maximal inhibitory concentration (IC50) and percentile rank. The fourth module leverages STAR (Dobin ) and featureCounts (Liao ) to quantify gene expression levels as transcripts per million. Finally, the fifth module selects a reduced set of peptides representing putative fusion neoantigens by considering their binding affinity and confidence score. Moreover, it annotates each neoantigen with IC50, percentile rank, confidence score, binding HLA type, expression of fusion and HLA genes and information about the presence of a premature stop codon that might cause nonsense-mediated decay of the fusion transcript.
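The difference between junction-spanning peptides and peptides from 3’ out-of-frame sequence, as considered by the second module, can be illustrated with a short sketch. This is not Arriba's or NeoFuse's code: the function name `fusion_peptides`, the 0-based `junction` index (position of the first residue contributed by the 3’ partner) and the default window of 8 to 11 residues are assumptions made for illustration only.

```python
from typing import Iterable, List


def fusion_peptides(protein: str, junction: int,
                    lengths: Iterable[int] = (8, 9, 10, 11),
                    out_of_frame: bool = False) -> List[str]:
    """Enumerate candidate peptides from a translated fusion protein.

    junction: 0-based index of the first residue encoded by the 3' partner
              (hypothetical convention chosen for this sketch).
    out_of_frame: if True, also keep windows lying entirely downstream of
              the junction, since a frameshifted 3' sequence is novel
              along its whole length.
    """
    if not 0 < junction < len(protein):
        raise ValueError("junction must fall inside the protein")
    peptides: List[str] = []
    seen = set()
    for k in lengths:
        # Earliest window that still contains the first 3' residue.
        first = max(0, junction - k + 1)
        # In-frame: the window must keep at least one 5' residue.
        # Out-of-frame: windows may run to the end of the sequence.
        last = len(protein) - k
        if not out_of_frame:
            last = min(junction - 1, last)
        for start in range(first, last + 1):
            pep = protein[start:start + k]
            if pep not in seen:
                seen.add(pep)
                peptides.append(pep)
    return peptides


# Toy sequence (letters are placeholders, not a real protein).
print(fusion_peptides("ABCDEFGHIJ", junction=5, lengths=(3,)))
print(fusion_peptides("ABCDEFGHIJ", junction=5, lengths=(3,), out_of_frame=True))
```

With an in-frame fusion only the windows that straddle the junction differ from the parental proteins, whereas an out-of-frame fusion contributes every window downstream of the junction, which is why the two cases are handled separately in this sketch.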
Fig. 1. Schematization of the NeoFuse pipeline: computational modules represented as dark-grey boxes (with tool names in square brackets) and output files as light-grey boxes

NeoFuse is available as ready-to-use Singularity and Docker images containing all the necessary software and dependencies. This allows running the pipeline in an isolated environment, preventing conflicts with other programs in the hosting environment. Local installation of the images is performed automatically by the NeoFuse bash script depending on the user’s choice (‘–docker’ or ‘–singularity’ option). Although not distributed as part of NeoFuse, netMHCpan (Jurtz ) can be used for peptide-HLA binding prediction instead of MHCflurry, provided that a local installation is available (see online documentation).
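For the expression values reported by the fourth module, the standard transcripts per million (TPM) calculation can be sketched as follows. The function and variable names are hypothetical, gene length is taken as a plain per-gene length in base pairs, and the snippet does not reproduce how NeoFuse itself post-processes featureCounts output.

```python
from typing import Dict


def tpm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Convert per-gene read counts to transcripts per million.

    Each count is first divided by gene length in kilobases (reads per
    kilobase); the rates are then rescaled so that they sum to one million.
    """
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    total = sum(rates.values())
    if total == 0:
        # No reads assigned to any gene: report zero expression everywhere.
        return {gene: 0.0 for gene in counts}
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Made-up counts and lengths, for illustration only.
example = tpm({"geneA": 10, "geneB": 20}, {"geneA": 1000, "geneB": 2000})
print(example)
```

Because the length normalization happens before the rescaling, TPM values sum to one million within each sample.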

3 Applications

To assess the performance of the gene fusion prediction module, we tested Arriba and other state-of-the-art tools on two benchmark RNA-seq datasets (Supplementary Fig. S1). When used with the ‘-c M’ parameter setting to select fusions with medium and high confidence scores, Arriba, together with STAR-Fusion (Haas ), was the best performer in terms of validated fusions identified, while also limiting the total number of called fusions. More aggressive or conservative solutions could be obtained using the ‘-c L’ or ‘-c H’ options, respectively. The analysis of each dataset with Arriba took, on average, 6 min per sample on a high performance computing node (HP XL230a in Apollo 6000) utilizing 10 cores (Intel E5-2699A v4, 2.4 GHz). As a test case, we analyzed eight RNA-seq datasets from the MCF7 breast cancer cell line (Supplementary Table S1), selecting fusions with medium and high Arriba confidence scores and peptides with an IC50 lower than 500 nM (‘-t 500’ option). On average, we identified 144 putative neoantigens from 40 gene fusions, with 83.96% of gene fusions characterized by out-of-frame sequences. The latter result suggests that gene fusions can be a source of neoantigens whose sequences are extremely different from those of self-peptides. Fusions shared across all datasets included gene pairs previously validated experimentally (BCAS4-BCAS3) or identified with computational methods (ABCA5-PPP4R1L, DEPDC1B-ELOVL7) (Picco ). OptiType predicted the correct HLA genotypes for the SRR1035698 dataset, but called homozygous HLA-B alleles for the datasets with low expression (Supplementary Table S2) and, thus, low read coverage of this gene (Supplementary Fig. S2).
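The selection criteria used in the test case (medium or high confidence, IC50 below 500 nM) can be expressed as a simple filter. The sketch below is illustrative only: the `Candidate` record, its field names and the example rows are hypothetical and do not correspond to the NeoFuse output format or to any real prediction.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Ordering of confidence levels, lowest to highest.
CONFIDENCE_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Candidate:
    fusion: str        # gene pair, e.g. "GENEA-GENEB" (placeholder)
    peptide: str       # candidate peptide sequence
    hla: str           # HLA allele the peptide is predicted to bind
    ic50_nm: float     # predicted binding affinity in nM (lower = stronger)
    confidence: str    # "low", "medium" or "high"


def select_candidates(candidates: Iterable[Candidate],
                      max_ic50_nm: float = 500.0,
                      min_confidence: str = "medium") -> List[Candidate]:
    """Keep candidates below the IC50 threshold and at or above the
    confidence floor, sorted from strongest to weakest predicted binder."""
    floor = CONFIDENCE_ORDER[min_confidence]
    kept = [c for c in candidates
            if c.ic50_nm < max_ic50_nm
            and CONFIDENCE_ORDER[c.confidence] >= floor]
    return sorted(kept, key=lambda c: c.ic50_nm)


# Made-up rows for illustration; not real predictions.
rows = [
    Candidate("GENEA-GENEB", "PEPTIDEAA", "HLA-A*02:01", 120.0, "high"),
    Candidate("GENEA-GENEB", "PEPTIDEBB", "HLA-A*02:01", 800.0, "high"),
    Candidate("GENEC-GENED", "PEPTIDECC", "HLA-A*02:01", 50.0, "low"),
    Candidate("GENEE-GENEF", "PEPTIDEDD", "HLA-A*02:01", 499.9, "medium"),
]
for c in select_candidates(rows):
    print(c.fusion, c.peptide, c.ic50_nm, c.confidence)
```

Raising `min_confidence` to "high" or lowering it to "low" mirrors the more conservative and more aggressive settings described above.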

4 Conclusions

NeoFuse is a novel computational pipeline to predict fusion neoantigens from tumor RNA-seq data. It is based on state-of-the-art computational tools and is available as ready-to-use Singularity and Docker images to ease installation and usage, requiring limited bioinformatic expertise. NeoFuse can be easily applied to RNA-seq data from patients with different cancer types. Thus, it can be used to identify fusion neoantigens that can broaden the repertoire of candidates for therapeutic cancer vaccination and T cell-based therapy and might ultimately extend the clinical benefit of immunotherapy to patients with low tumor mutational burden. In the near future, we plan to extend NeoFuse to the prediction of class-II fusion neoantigens recognized by CD4+ T cells.

Funding

This work was supported by the Austrian Cancer Aid/Tyrol [project n. 17003 to F.F.], by the Austrian Science Fund (FWF) [project n. T 974-B30 to F.F.] and by the European Research Council [advanced grant agreement n. 786295 to Z.T.].

Conflict of Interest: none declared.
  13 in total

1.  featureCounts: an efficient general purpose program for assigning sequence reads to genomic features.

Authors:  Yang Liao; Gordon K Smyth; Wei Shi
Journal:  Bioinformatics       Date:  2013-11-13       Impact factor: 6.937

2.  MHCflurry: Open-Source Class I MHC Binding Affinity Prediction.

Authors:  Timothy J O'Donnell; Alex Rubinsteyn; Maria Bonsack; Angelika B Riemer; Uri Laserson; Jeff Hammerbacher
Journal:  Cell Syst       Date:  2018-06-27       Impact factor: 10.304

3.  STAR: ultrafast universal RNA-seq aligner.

Authors:  Alexander Dobin; Carrie A Davis; Felix Schlesinger; Jorg Drenkow; Chris Zaleski; Sonali Jha; Philippe Batut; Mark Chaisson; Thomas R Gingeras
Journal:  Bioinformatics       Date:  2012-10-25       Impact factor: 6.937

4.  NetMHCpan-4.0: Improved Peptide-MHC Class I Interaction Predictions Integrating Eluted Ligand and Peptide Binding Affinity Data.

Authors:  Vanessa Jurtz; Sinu Paul; Massimo Andreatta; Paolo Marcatili; Bjoern Peters; Morten Nielsen
Journal:  J Immunol       Date:  2017-10-04       Impact factor: 5.422

5.  Neoantigen Dissimilarity to the Self-Proteome Predicts Immunogenicity and Response to Immune Checkpoint Blockade.

Authors:  Lee P Richman; Robert H Vonderheide; Andrew J Rech
Journal:  Cell Syst       Date:  2019-10-09       Impact factor: 10.304

Review 6.  Alternative tumour-specific antigens.

Authors:  Christof C Smith; Sara R Selitsky; Shengjie Chai; Paul M Armistead; Benjamin G Vincent; Jonathan S Serody
Journal:  Nat Rev Cancer       Date:  2019-07-05       Impact factor: 60.716

Review 7.  Next-generation computational tools for interrogating cancer immunity.

Authors:  Francesca Finotello; Dietmar Rieder; Hubert Hackl; Zlatko Trajanoski
Journal:  Nat Rev Genet       Date:  2019-09-12       Impact factor: 59.581

8.  INTEGRATE-neo: a pipeline for personalized gene fusion neoantigen discovery.

Authors:  Jin Zhang; Elaine R Mardis; Christopher A Maher
Journal:  Bioinformatics       Date:  2017-02-15       Impact factor: 6.937

9.  Immunogenic neoantigens derived from gene fusions stimulate T cell responses.

Authors:  Wei Yang; Ken-Wing Lee; Raghvendra M Srivastava; Fengshen Kuo; Chirag Krishna; Diego Chowell; Vladimir Makarov; Douglas Hoen; Martin G Dalin; Leonard Wexler; Ronald Ghossein; Nora Katabi; Zaineb Nadeem; Marc A Cohen; S Ken Tian; Nicolas Robine; Kanika Arora; Heather Geiger; Phaedra Agius; Nancy Bouvier; Kety Huberman; Katelynd Vanness; Jonathan J Havel; Jennifer S Sims; Robert M Samstein; Rajarsi Mandal; Justin Tepe; Ian Ganly; Alan L Ho; Nadeem Riaz; Richard J Wong; Neerav Shukla; Timothy A Chan; Luc G T Morris
Journal:  Nat Med       Date:  2019-04-22       Impact factor: 53.440

10.  Identification of candidate neoantigens produced by fusion transcripts in human osteosarcomas.

Authors:  Susan K Rathe; Flavia E Popescu; James E Johnson; Adrienne L Watson; Tracy A Marko; Branden S Moriarity; John R Ohlfest; David A Largaespada
Journal:  Sci Rep       Date:  2019-01-23       Impact factor: 4.996

  8 in total

Review 1.  Neoantigen prediction and computational perspectives towards clinical benefit: recommendations from the ESMO Precision Medicine Working Group.

Authors:  L De Mattos-Arruda; M Vazquez; F Finotello; R Lepore; E Porta; J Hundal; P Amengual-Rigo; C K Y Ng; A Valencia; J Carrillo; T A Chan; V Guallar; N McGranahan; J Blanco; M Griffith
Journal:  Ann Oncol       Date:  2020-06-28       Impact factor: 32.976

Review 2.  Computational cancer neoantigen prediction: current status and recent advances.

Authors:  G Fotakis; Z Trajanoski; D Rieder
Journal:  Immunooncol Technol       Date:  2021-11-20

3.  Development and Characterization of MYB-NFIB Fusion Expression in Adenoid Cystic Carcinoma.

Authors:  Joseph O Humtsoe; Hyun-Su Kim; Leilani Jones; James Cevallos; Philippe Boileau; Fengshen Kuo; Luc G T Morris; Patrick Ha
Journal:  Cancers (Basel)       Date:  2022-04-30       Impact factor: 6.575

4.  ProTECT-Prediction of T-Cell Epitopes for Cancer Therapy.

Authors:  Arjun A Rao; Ada A Madejska; Jacob Pfeil; Benedict Paten; Sofie R Salama; David Haussler
Journal:  Front Immunol       Date:  2020-11-10       Impact factor: 7.561

5.  LongGF: computational algorithm and software tool for fast and accurate detection of gene fusions by long-read transcriptome sequencing.

Authors:  Qian Liu; Yu Hu; Andres Stucky; Li Fang; Jiang F Zhong; Kai Wang
Journal:  BMC Genomics       Date:  2020-12-29       Impact factor: 3.969

6.  nextNEOpi: a comprehensive pipeline for computational neoantigen prediction.

Authors:  Dietmar Rieder; Georgios Fotakis; Markus Ausserhofer; Geyeregger René; Wolfgang Paster; Zlatko Trajanoski; Francesca Finotello
Journal:  Bioinformatics       Date:  2021-11-12       Impact factor: 6.931

Review 7.  Immunosurveillance and Immunoediting of Lung Cancer: Current Perspectives and Challenges.

Authors:  Kei Kunimasa; Taichiro Goto
Journal:  Int J Mol Sci       Date:  2020-01-17       Impact factor: 5.923

Review 8.  T Cell Epitope Prediction and Its Application to Immunotherapy.

Authors:  Anna-Lisa Schaap-Johansen; Milena Vujović; Annie Borch; Sine Reker Hadrup; Paolo Marcatili
Journal:  Front Immunol       Date:  2021-09-15       Impact factor: 7.561
