Matthew C LaFave, Gaurav K Varshney, Shawn M Burgess.
Translational and Functional Genomics Branch, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892-8004, USA.
Abstract
There are several experimental contexts in which it is important to identify DNA integration sites, such as insertional mutagenesis screens, gene and enhancer trap applications, and gene therapy. We previously developed an assay to identify millions of integrations in multiplexed barcoded samples at base-pair resolution. The sheer amount of data produced by this approach makes the mapping of individual sites non-trivial without bioinformatics support. This article presents the Genomic Integration Site Tracker (GeIST), a command-line pipeline designed to map the integration sites produced by this assay and identify the samples from which they came. GeIST version 2.1.0, a more adaptable version of our original pipeline, can identify integrations of murine leukemia virus, adeno-associated virus, Tol2 transposons or Ac/Ds transposons, and can be adapted for other inserted elements. It has been tested on experimental data for each of these delivery vectors and fine-tuned to account for sequencing and cloning artifacts.
AVAILABILITY AND IMPLEMENTATION: GeIST uses a combination of Bash shell scripting and Perl. GeIST is available at http://research.nhgri.nih.gov/software/GeIST/.
CONTACT: burgess@mail.nih.gov.
Published by Oxford University Press 2015. This work is written by US Government employees and is in the public domain in the US.