
Predicting pathogenic non-coding SVs disrupting the 3D genome in 1646 whole cancer genomes using multiple instance learning.

Marleen M Nieboer, Luan Nguyen, Jeroen de Ridder.

Abstract

In recent years, large consortia have been established to sequence the whole genomes of many cancer patients. Despite the growing number of tools for studying the impact of single nucleotide variants (SNVs), non-coding structural variants (SVs) have been largely ignored in these data. Here, we introduce svMIL2, an improved version of our Multiple Instance Learning-based method for studying the effect of somatic non-coding SVs that disrupt the boundaries of topologically associating domains (TADs) and CTCF loops in 1646 cancer genomes. We demonstrate that svMIL2 predicts pathogenic non-coding SVs with an average AUC of 0.86 across 12 cancer types and identifies non-coding SVs affecting well-known driver genes. The disruption of active (super) enhancers in open chromatin regions appears to be a common mechanism by which non-coding SVs exert their pathogenicity. Finally, our results reveal that the contribution of pathogenic non-coding SVs relative to driver SNVs may vary widely between cancers, with notably high numbers of genes disrupted by pathogenic non-coding SVs in ovarian and pancreatic cancer. Taken together, our machine learning method offers a potent way to prioritize putatively pathogenic non-coding SVs and to leverage them to identify driver genes. Moreover, our analysis of 1646 cancer genomes demonstrates the importance of including non-coding SVs in cancer diagnostics.
© 2021. The Author(s).
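
The abstract names two technical ingredients: Multiple Instance Learning (labels attached to bags of instances rather than to single instances) and evaluation by AUC. The sketch below illustrates both in a generic form and is not the svMIL2 implementation. The bag layout, the two-element feature vectors, the fixed `weights`, and the max-pooling rule are all assumptions chosen for illustration only.

```python
from typing import List, Sequence

Instance = Sequence[float]  # hypothetical feature vector for one instance
Bag = List[Instance]        # a bag groups instances under a single label


def instance_score(x: Instance, weights: Sequence[float]) -> float:
    """Linear score for one instance (weights are illustrative, not learned)."""
    return sum(w * v for w, v in zip(weights, x))


def bag_score(bag: Bag, weights: Sequence[float]) -> float:
    """Standard MIL assumption: a bag is as positive as its strongest instance."""
    return max(instance_score(x, weights) for x in bag)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative bag")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (invented): each bag holds one or more instance feature vectors.
bags: List[Bag] = [
    [[0.1, 0.0], [0.9, 0.8]],               # labelled positive
    [[0.2, 0.1], [0.1, 0.3]],               # labelled negative
    [[0.7, 0.9]],                           # labelled positive
    [[0.0, 0.2], [0.3, 0.0], [0.1, 0.1]],   # labelled negative
]
labels = [1, 0, 1, 0]
weights = [1.0, 1.0]

scores = [bag_score(b, weights) for b in bags]
print(auc(scores, labels))
```

Max-pooling is only one way to aggregate instances into a bag-level prediction; other MIL formulations embed the whole bag before classification, and the abstract does not state which variant svMIL2 uses.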

Year: 2021  PMID: 34257393  DOI: 10.1038/s41598-021-93917-y

Source DB: PubMed  Journal: Sci Rep  ISSN: 2045-2322  Impact factor: 4.379


