Krzysztof Polański (1), Matthew D Young (1), Zhichao Miao (1,2), Kerstin B Meyer (1), Sarah A Teichmann (1,3), Jong-Eun Park (1).
(1) Cellular Genetics, Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.
(2) European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK.
(3) Theory of Condensed Matter Group, Cavendish Laboratory/Department of Physics, University of Cambridge, Cambridge CB3 0HE, UK.
Abstract
MOTIVATION: Increasing numbers of large-scale single-cell RNA-Seq projects are leading to a data explosion, which can only be fully exploited through data integration. A number of methods have been developed to combine diverse datasets by removing technical batch effects, but most are computationally intensive. To overcome the challenge of enormous datasets, we have developed BBKNN, an extremely fast graph-based data integration algorithm. We illustrate the power of BBKNN on large-scale mouse atlasing data and benchmark its run time favourably against a number of competing methods.

AVAILABILITY AND IMPLEMENTATION: BBKNN is available at https://github.com/Teichlab/bbknn, along with documentation and multiple example notebooks, and can be installed with pip.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
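The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the general idea behind a batch-balanced neighbour graph: each cell takes its nearest neighbours from every batch separately, rather than from the pooled data. It is not the BBKNN implementation. The function name `batch_balanced_neighbors`, the toy coordinates, and the brute-force Euclidean search are all assumptions made for illustration.

```python
import math


def batch_balanced_neighbors(points, batches, k=1):
    """Hypothetical helper: for each cell, collect its k nearest cells
    from every batch separately (brute-force Euclidean search)."""
    batch_ids = sorted(set(batches))
    graph = {}
    for i, p in enumerate(points):
        neighbors = []
        for b in batch_ids:
            # Candidate cells from batch b, excluding the cell itself.
            candidates = [
                j for j, bj in enumerate(batches) if bj == b and j != i
            ]
            candidates.sort(key=lambda j: math.dist(p, points[j]))
            neighbors.extend(candidates[:k])
        graph[i] = neighbors
    return graph


# Toy data (assumed): two batches of three cells, the second batch
# shifted by a constant offset to mimic a technical batch effect.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0),
          (1.0, 1.0), (1.1, 1.0), (6.0, 6.0)]
batches = ["a", "a", "a", "b", "b", "b"]

graph = batch_balanced_neighbors(points, batches, k=1)
```

Because every cell is forced to have neighbours in each batch, the resulting graph links matching populations across batches even when a constant offset separates them. A practical tool would replace the brute-force search with a much faster neighbour lookup.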