BACKGROUND: A cancer genome is derived from the germline genome through a series of somatic mutations. Somatic structural variants, including duplications, deletions, inversions, translocations, and other rearrangements, result in a cancer genome that is a scrambling of intervals, or "blocks", of the germline genome sequence. We present an efficient algorithm for reconstructing the block organization of a cancer genome from paired-end DNA sequencing data.

RESULTS: By aligning paired reads from a cancer genome (and a matched germline genome, if available) to the human reference genome, we derive: (i) a partition of the reference genome into intervals; (ii) adjacencies between these intervals in the cancer genome; (iii) an estimated copy number for each interval. We formulate the Copy Number and Adjacency Genome Reconstruction Problem of determining the cancer genome as a sequence of the derived intervals that is consistent with the measured adjacencies and copy numbers. We design an efficient algorithm, called Paired-end Reconstruction of Genome Organization (PREGO), to solve this problem by reducing it to an optimization problem on an interval-adjacency graph constructed from the data. The solution to the optimization problem yields an Eulerian graph containing an alternating Eulerian tour that corresponds to a cancer genome consistent with the sequencing data. We apply our algorithm to five ovarian cancer genomes that were sequenced as part of The Cancer Genome Atlas. We identify numerous rearrangements, or structural variants, in these genomes, analyze reciprocal vs. non-reciprocal rearrangements, and identify rearrangements consistent with known mechanisms of duplication such as tandem duplications and breakage/fusion/bridge (B/F/B) cycles.

CONCLUSIONS: We demonstrate that PREGO efficiently identifies complex and biologically relevant rearrangements in cancer genome sequencing data.
An implementation of the PREGO algorithm is available at http://compbio.cs.brown.edu/software/.
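
The consistency requirement in the problem statement can be illustrated with a small sketch. The code below is not the PREGO implementation and does not perform the optimization on the interval-adjacency graph; it only checks whether a candidate block sequence reproduces a given set of copy numbers and adjacencies. The encoding is an assumption made for illustration: a block is a hypothetical `(interval_name, orientation)` pair, each interval has a tail end `"t"` and a head end `"h"`, and the names `ends`, `implied_counts`, and `is_consistent` are invented here. The toy data describe a tandem duplication of interval B and assume noise-free measurements and a single linear chromosome.

```python
from collections import Counter

# Hypothetical toy encoding (not from the paper): a block is
# (interval_name, orientation). Orientation +1 reads the interval
# tail -> head; orientation -1 reads it head -> tail.


def ends(block):
    """Return (entry_end, exit_end) of a block as (name, 't' or 'h') pairs."""
    name, orientation = block
    tail, head = (name, "t"), (name, "h")
    return (tail, head) if orientation == 1 else (head, tail)


def implied_counts(genome):
    """Count the interval copies and adjacencies implied by a block sequence."""
    copies = Counter(name for name, _ in genome)
    adjacencies = Counter()
    for left, right in zip(genome, genome[1:]):
        exit_end = ends(left)[1]
        entry_end = ends(right)[0]
        # Adjacencies are undirected, so store each pair in sorted order.
        adjacencies[tuple(sorted((exit_end, entry_end)))] += 1
    return copies, adjacencies


def is_consistent(genome, copy_number, adjacencies):
    """True if the genome uses exactly the given copy numbers and adjacencies."""
    copies, implied = implied_counts(genome)
    return copies == Counter(copy_number) and implied == Counter(adjacencies)


# Toy "measurements": interval B has copy number 2, and one variant
# adjacency joins the head of B back to the tail of B.
copy_number = {"A": 1, "B": 2, "C": 1}
adjacencies = {
    (("A", "h"), ("B", "t")): 1,  # reference adjacency
    (("B", "h"), ("C", "t")): 1,  # reference adjacency
    (("B", "h"), ("B", "t")): 1,  # variant adjacency (tandem duplication)
}

cancer = [("A", 1), ("B", 1), ("B", 1), ("C", 1)]
reference = [("A", 1), ("B", 1), ("C", 1)]

print(is_consistent(cancer, copy_number, adjacencies))     # True
print(is_consistent(reference, copy_number, adjacencies))  # False
```

In this toy case the sequence A B B C uses every interval and adjacency exactly as often as measured, while the unrearranged reference order A B C does not. With real sequencing data the measurements are noisy, so an exact match like this is not expected; per the abstract, PREGO instead solves an optimization problem whose solution admits an alternating Eulerian tour.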