Ali Salari (1), Gregory Kiar (2,3), Lindsay Lewis (2), Alan C Evans (2,3), Tristan Glatard (1)

1. Department of Computer Science and Software Engineering, Concordia University, Montreal, QC, Canada.
2. Department of Biomedical Engineering, McGill University, Montreal, QC, Canada.
3. Montreal Neurological Institute, McGill University, Montreal, QC, Canada.
Abstract
BACKGROUND: Data analysis pipelines are known to be affected by computational conditions, presumably owing to the creation and propagation of numerical errors. While this process could play a major role in the current reproducibility crisis, the precise causes of such instabilities and the path along which they propagate in pipelines are unclear. METHOD: We present Spot, a tool to identify which processes in a pipeline create numerical differences when executed in different computational conditions. Spot leverages system-call interception through ReproZip to reconstruct and compare provenance graphs without pipeline instrumentation. RESULTS: By applying Spot to the structural pre-processing pipelines of the Human Connectome Project, we found that linear and non-linear registration are the cause of most numerical instabilities in these pipelines, which confirms previous findings.
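The idea of locating where differences are created, as opposed to merely passed along, can be illustrated with a small sketch. This is not Spot's implementation and it does not use ReproZip. It assumes that each process's input and output files are already known and that a checksum is available for every file under two conditions. The `Process` record, the `classify_processes` function, the file names, and the labels `creates`, `propagates`, and `stable` are all hypothetical, introduced only for illustration.

```python
from typing import Dict, List, NamedTuple


class Process(NamedTuple):
    """Hypothetical record of one pipeline process and the files it touched."""
    name: str
    inputs: List[str]
    outputs: List[str]


def classify_processes(
    processes: List[Process],
    checksums_a: Dict[str, str],
    checksums_b: Dict[str, str],
) -> Dict[str, str]:
    """Label each process by comparing file checksums from two conditions.

    Assumed rule (an illustration, not necessarily the rule Spot applies):
    - "creates":    all inputs match across conditions, some output differs
    - "propagates": some input differs and some output differs
    - "stable":     no output differs
    """
    def differs(path: str) -> bool:
        return checksums_a.get(path) != checksums_b.get(path)

    labels: Dict[str, str] = {}
    for proc in processes:
        input_differs = any(differs(p) for p in proc.inputs)
        output_differs = any(differs(p) for p in proc.outputs)
        if output_differs and not input_differs:
            labels[proc.name] = "creates"
        elif output_differs:
            labels[proc.name] = "propagates"
        else:
            labels[proc.name] = "stable"
    return labels


# Toy two-step pipeline with made-up file names and checksums.
processes = [
    Process("step_one", ["input.dat"], ["intermediate.dat"]),
    Process("step_two", ["intermediate.dat"], ["final.dat"]),
]
checksums_a = {"input.dat": "a1", "intermediate.dat": "b1", "final.dat": "c1"}
checksums_b = {"input.dat": "a1", "intermediate.dat": "b2", "final.dat": "c2"}

labels = classify_processes(processes, checksums_a, checksums_b)
print(labels)
```

In this toy case `step_one` receives identical input under both conditions but writes a different output, so it is the one flagged as creating a difference, while `step_two` only passes along a difference it received.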