Xiaoling Chen (1), Jeffrey T Chang (1,2,3)
1. School of Biomedical Informatics.
2. Department of Integrative Biology & Pharmacology, University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
3. Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Abstract
Motivation: Bioinformatic analyses are becoming formidably more complex, both because more steps are required to process the data and because the methods available for each step keep proliferating. Pipelines are commonly employed to alleviate this difficulty. However, a pipeline is typically implemented to automate one specific analysis, which makes it difficult to use for exploratory analyses that require systematic changes to the software or parameters.
Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY), which includes a knowledge base where the capabilities of bioinformatics software are explicitly and formally encoded. BETSY is a backwards-chaining, rule-based expert system composed of a data model that can capture the richness of biological data and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next-generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise.
Availability and Implementation: https://github.com/jefftc/changlab.
Contact: jeffrey.t.chang@uth.tmc.edu.
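To illustrate what backwards chaining over a rule base means in this setting, the following is a minimal sketch and not BETSY's implementation. The rule names, data type names, and the `backward_chain` function are all hypothetical, chosen only for illustration. Each rule states which data types a piece of software consumes and which it produces; the engine starts from the requested result and works backwards until every requirement is met by data the user already has.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    name: str      # hypothetical module name
    inputs: tuple  # data types the module consumes
    output: str    # data type the module produces


# Hypothetical toy knowledge base; these are NOT BETSY's actual rules.
RULES = [
    Rule("align_reads", ("fastq", "reference_genome"), "bam"),
    Rule("count_reads", ("bam",), "count_matrix"),
    Rule("normalize", ("count_matrix",), "normalized_matrix"),
    Rule("cluster", ("normalized_matrix",), "heatmap"),
]


def backward_chain(goal, available, rules, _seen=frozenset()):
    """Return an ordered list of rule names that produces `goal`, or None."""
    if goal in available:
        return []      # the user already has this data
    if goal in _seen:
        return None    # guard against circular rules
    for rule in rules:
        if rule.output != goal:
            continue
        steps = []
        for needed in rule.inputs:
            sub = backward_chain(needed, available, rules, _seen | {goal})
            if sub is None:
                break  # this rule cannot be satisfied; try the next one
            steps.extend(s for s in sub if s not in steps)
        else:
            # every input was resolved, so this rule completes the workflow
            return steps + [rule.name]
    return None


workflow = backward_chain("heatmap", {"fastq", "reference_genome"}, RULES)
print(workflow)
```

In this toy example the engine resolves the request for a heatmap into a chain of four steps, three of which are pre-processing. If a required input such as the reference genome is missing and no rule can produce it, the function returns `None` rather than a partial workflow.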