Application of target capture sequencing of exons and conserved non-coding sequences to 20 inbred rat strains.

Minako Yoshihara, Tetsuya Sato, Daisuke Saito, Osamu Ohara, Takashi Kuramoto, Mikita Suyama.

Abstract

We report sequence data obtained by applying our recently devised target capture method, TargetEC, to 20 inbred rat strains. This method targets not only all annotated exons but also highly conserved non-coding sequences shared among vertebrates. The target regions total 146.8 Mb. On average, we obtained 31.7× depth of target coverage and identified 154,330 SNVs and 24,368 INDELs per strain. These correspond to 470,037 unique SNVs and 68,652 unique INDELs among the 20 strains. The sequence data can be accessed at DDBJ/EMBL/GenBank under accession number PRJDB4648, and the identified variants have been deposited at http://bioinfo.sls.kyushu-u.ac.jp/rat_target_capture/20_strains.vcf.gz.

Year:  2016        PMID: 27882299      PMCID: PMC5114524          DOI: 10.1016/j.gdata.2016.11.010

Source DB:  PubMed          Journal:  Genom Data        ISSN: 2213-5960


Direct link to deposited data

http://www.ncbi.nlm.nih.gov/bioproject/?term=PRJDB4648
http://bioinfo.sls.kyushu-u.ac.jp/rat_target_capture/20_strains.vcf.zip

Experimental design, materials and methods

Rats are used as animal models of many human diseases, such as cancer and hypertension. Because of the rat's significance in biomedical research, the genome sequence of the Brown Norway rat strain was determined as the third complete mammalian genome [1]. The National BioResource Project–Rat (NBRP-Rat) at Kyoto University is one of the largest repositories for rat strains; more than 700 strains have currently been collected and preserved as live animals, embryos, or sperm [2]. Determining the genome sequences of these strains is important not only for understanding the genetic causes of various phenotypes but also for augmenting their value as biological resources. Whole exome sequencing is an efficient approach for characterizing only the exonic portions of a genome, which typically comprise 1%–2% of a complete mammalian genome, and it has been used successfully to identify relevant genes and their causative mutations in many human diseases. Although some non-human exome capture kits exist, no such capture probe set had previously been available for rats. We therefore established a target capture kit specifically designed for the rat, employing the SeqCap EZ Developer Library (Roche NimbleGen, Madison, WI, USA; design name 140929_RN5_MS_EZ_HX1). In designing the probe set, we included highly conserved non-coding sequences (CNSs) as target regions in addition to all annotated exons, covering a total of 146.8 Mb of the genome [3]. By applying this target capture method, TargetEC (target capture for exons and conserved non-coding sequences), to four rat strains (WTC/Kyo, WTC-swh/Kyo, PVG/Seac, and KFRS4/Kyo), we confirmed that it performs efficiently in identifying causative mutations, including those in non-coding regions [3]. In this study, we applied TargetEC to 20 additional inbred strains preserved in NBRP-Rat to identify further variants across multiple rat strains.
These 20 strains were selected from the following three categories: disease models derived from selective breeding (BDIX/NemOda, BDIX.Cg-Tal/NemOda, BUF/MNa, HTX/Kyo, HWY/Slc, KFRS3B/Kyo, RCS/Kyo, ZF, and ZFDM), strains originating from wild populations (BN/SsNSlc, DOB/Oda, IS/Kyo, IS-Tlk/Kyo, LE/Stm, LEC/Tj, and NIG-III/Hok), and representative inbred strains (F344/DuCrlCrlj, F344/Jcl, F344/NSlc, and F344/Stm). All animal experimentation protocols were approved by the Institutional Animal Care and Use Committees of Kyoto University and were conducted according to the Regulation on Animal Experimentation at Kyoto University. Genomic DNA was extracted from spleen samples using standard protocols. Target capture was performed using the standard SeqCap EZ System protocol (Roche NimbleGen). DNA sequencing libraries were prepared using the KAPA HyperPlus Library Preparation Kit (KAPA Biosystems, London, UK) according to the manufacturer's protocol. Sequencing was performed on an Illumina NextSeq 500 platform (Illumina, San Diego, CA, USA) using the High Output Kit (2 × 150 cycles). We obtained 61–82 million reads for each strain (Table 1). Sequence reads were mapped to the rat genome assembly rn5 (RGSC 5.0, March 2012) using BWA (v0.7.4) [4] with default parameters. SAMtools (v0.1.12a) [5], Picard tools (v1.87) (http://broadinstitute.github.io/picard/), and the Genome Analysis Toolkit (GATK; v2.5.2) [6] were used for post-processing of the mapped reads. Variants were called with the UnifiedGenotyper utility in GATK. On average, we identified 154,330 SNVs and 24,368 INDELs per strain in the target regions (Table 1). Across the 20 strains, the numbers of unique SNVs and INDELs were 470,037 and 68,652, respectively. The sequence data and variants identified for these strains represent valuable resources for further genetic studies in the rat.
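The distinction between per-strain counts and unique counts across strains can be illustrated with a short sketch. It is not the authors' pipeline (variants were called with GATK UnifiedGenotyper); it only assumes that the variants are available as a multi-sample VCF whose FORMAT column carries GT and DP fields, which the text does not state about the deposited file. The names classify, count_variants, MIN_DEPTH, and TOY_VCF are hypothetical, and the depth cut-off is assumed to mirror the "depth ≥ 5×" criterion of Table 1.

```python
from collections import defaultdict

MIN_DEPTH = 5  # assumption: mirrors the "depth >= 5x" criterion in Table 1


def classify(ref, alt):
    """Return 'SNV', 'INDEL', or 'OTHER' for one REF/ALT allele pair."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) != len(alt):
        return "INDEL"
    return "OTHER"  # e.g. multi-nucleotide substitutions


def count_variants(vcf_lines):
    """Count per-sample variants and unique variants in a multi-sample VCF."""
    samples = []
    per_sample = defaultdict(lambda: {"SNV": 0, "INDEL": 0})
    unique = {"SNV": set(), "INDEL": set()}
    for raw in vcf_lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample (strain) names follow FORMAT
            continue
        chrom, pos, ref = fields[0], fields[1], fields[3]
        alts = fields[4].split(",")
        keys = fields[8].split(":")
        if "GT" not in keys or "DP" not in keys:
            continue
        gt_i, dp_i = keys.index("GT"), keys.index("DP")
        for name, column in zip(samples, fields[9:]):
            values = column.split(":")
            if dp_i >= len(values) or not values[dp_i].isdigit():
                continue  # depth missing for this sample
            if int(values[dp_i]) < MIN_DEPTH:
                continue  # below the depth cut-off
            alleles = set(values[gt_i].replace("|", "/").split("/"))
            for allele in alleles - {".", "0"}:  # keep non-reference alleles
                alt = alts[int(allele) - 1]
                kind = classify(ref, alt)
                if kind == "OTHER":
                    continue
                per_sample[name][kind] += 1
                unique[kind].add((chrom, pos, ref, alt))
    return dict(per_sample), {k: len(v) for k, v in unique.items()}


def _row(*cols):
    return "\t".join(cols)


# Toy input with two hypothetical samples, A and B (not real data).
HEADER = ("#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT")
TOY_VCF = [
    "##fileformat=VCFv4.1",
    _row(*HEADER, "A", "B"),
    _row("chr1", "100", ".", "A", "G", "50", "PASS", ".", "GT:DP", "1/1:10", "1/1:12"),
    _row("chr1", "200", ".", "C", "T", "50", "PASS", ".", "GT:DP", "1/1:8", "0/0:9"),
    _row("chr1", "300", ".", "G", "GA", "50", "PASS", ".", "GT:DP", "0/0:7", "1/1:6"),
    _row("chr1", "400", ".", "T", "C", "50", "PASS", ".", "GT:DP", "1/1:3", "0/0:20"),
]

per_sample, unique = count_variants(TOY_VCF)
print(per_sample)
print(unique)
```

In the toy input, the per-sample SNV counts add up to three, yet only two SNVs are unique because both samples share the variant at position 100. Sharing of this kind is why the unique totals reported for the 20 strains are far smaller than 20 times the per-strain average.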
Table 1

Summary statistics for sequencing and variant calling.

| Strain | Sex | Total reads | Read length (bp) | Mapped reads after post-processing (%) | Average target depth | SNVs (depth ≥ 5×) | INDELs (depth ≥ 5×) |
|---|---|---|---|---|---|---|---|
| BDIX.Cg-Tal/NemOda | Unknown | 77,031,192 | 151 | 62,133,380 (80.7) | 33.0 | 161,043 | 25,729 |
| BDIX/NemOda | Female | 62,884,340 | 151 | 50,668,261 (80.6) | 26.2 | 155,727 | 24,561 |
| BN/SsNSlc | Male | 67,363,478 | 151 | 54,385,129 (80.7) | 29.4 | 23,060 | 5,533 |
| BUF/MNa | Male | 60,898,020 | 151 | 49,603,905 (81.5) | 27.2 | 154,382 | 24,122 |
| DOB/Oda | Male | 68,359,820 | 151 | 61,010,641 (89.2) | 31.7 | 196,751 | 30,148 |
| F344/DuCrlCrlj | Male | 73,516,660 | 151 | 59,541,186 (81.0) | 27.7 | 152,184 | 23,890 |
| F344/Jcl | Male | 62,994,072 | 151 | 50,991,611 (80.9) | 26.6 | 152,141 | 23,855 |
| F344/NSlc | Male | 62,838,170 | 151 | 50,726,936 (80.7) | 27.5 | 152,546 | 23,930 |
| F344/Stm | Male | 64,788,908 | 151 | 52,984,127 (81.8) | 29.1 | 151,919 | 23,735 |
| HTX/Kyo | Male | 72,484,640 | 151 | 64,572,821 (89.1) | 33.7 | 154,418 | 24,156 |
| HWY/Slc | Male | 74,687,034 | 151 | 66,579,903 (89.1) | 34.6 | 157,070 | 24,873 |
| IS/Kyo | Male | 79,430,344 | 151 | 70,744,396 (89.1) | 37.4 | 187,300 | 29,120 |
| IS-Tlk/Kyo | Male | 75,990,092 | 151 | 67,761,875 (89.2) | 35.8 | 186,648 | 28,902 |
| KFRS3B/Kyo | Female | 81,643,134 | 151 | 72,603,786 (88.9) | 35.1 | 154,292 | 24,419 |
| LE/Stm | Male | 72,300,094 | 151 | 58,438,239 (80.8) | 31.7 | 157,488 | 25,052 |
| LEC/Tj | Unknown | 78,990,272 | 151 | 70,539,682 (89.3) | 37.3 | 167,547 | 26,315 |
| NIG-III/Hok | Unknown | 78,128,624 | 151 | 69,625,354 (89.1) | 36.9 | 164,732 | 26,195 |
| RCS/Kyo | Male | 71,627,648 | 151 | 57,894,324 (80.8) | 31.6 | 155,472 | 24,975 |
| ZF | Male | 69,986,466 | 151 | 56,655,891 (81.0) | 30.2 | 150,778 | 23,815 |
| ZFDM | Male | 73,535,060 | 151 | 59,407,086 (80.8) | 31.9 | 151,101 | 24,025 |
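The averages quoted in the text can be rechecked directly from Table 1. The following is a minimal sketch using values transcribed from the table; the names TABLE1, round_half_up, and summarize are illustrative and are not part of the original analysis.

```python
# (strain, total reads, average target depth, SNVs, INDELs), transcribed from Table 1
TABLE1 = [
    ("BDIX.Cg-Tal/NemOda", 77031192, 33.0, 161043, 25729),
    ("BDIX/NemOda", 62884340, 26.2, 155727, 24561),
    ("BN/SsNSlc", 67363478, 29.4, 23060, 5533),
    ("BUF/MNa", 60898020, 27.2, 154382, 24122),
    ("DOB/Oda", 68359820, 31.7, 196751, 30148),
    ("F344/DuCrlCrlj", 73516660, 27.7, 152184, 23890),
    ("F344/Jcl", 62994072, 26.6, 152141, 23855),
    ("F344/NSlc", 62838170, 27.5, 152546, 23930),
    ("F344/Stm", 64788908, 29.1, 151919, 23735),
    ("HTX/Kyo", 72484640, 33.7, 154418, 24156),
    ("HWY/Slc", 74687034, 34.6, 157070, 24873),
    ("IS/Kyo", 79430344, 37.4, 187300, 29120),
    ("IS-Tlk/Kyo", 75990092, 35.8, 186648, 28902),
    ("KFRS3B/Kyo", 81643134, 35.1, 154292, 24419),
    ("LE/Stm", 72300094, 31.7, 157488, 25052),
    ("LEC/Tj", 78990272, 37.3, 167547, 26315),
    ("NIG-III/Hok", 78128624, 36.9, 164732, 26195),
    ("RCS/Kyo", 71627648, 31.6, 155472, 24975),
    ("ZF", 69986466, 30.2, 150778, 23815),
    ("ZFDM", 73535060, 31.9, 151101, 24025),
]


def round_half_up(value):
    """Round a non-negative number to the nearest integer, halves upward."""
    return int(value + 0.5)


def summarize(rows):
    """Return the per-strain averages and the read-count range for the table."""
    n = len(rows)
    reads = [r[1] for r in rows]
    return {
        "strains": n,
        "mean_depth": round(sum(r[2] for r in rows) / n, 1),
        "mean_snv": round_half_up(sum(r[3] for r in rows) / n),
        "mean_indel": round_half_up(sum(r[4] for r in rows) / n),
        "min_reads_millions": round_half_up(min(reads) / 1e6),
        "max_reads_millions": round_half_up(max(reads) / 1e6),
    }


summary = summarize(TABLE1)
print(summary)
```

With these inputs the sketch reproduces the reported figures: a mean target depth of 31.7×, means of 154,330 SNVs and 24,368 INDELs per strain, and a range of 61–82 million reads. The low counts for BN/SsNSlc are expected because the reference assembly is derived from the Brown Norway strain.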

Conflict of interest

The authors declare no conflicts of interest.

Specifications

Organism/cell line/tissue: Rattus norvegicus (BDIX/NemOda, BDIX.Cg-Tal/NemOda, BN/SsNSlc, BUF/MNa, DOB/Oda, F344/DuCrlCrlj, F344/Jcl, F344/NSlc, F344/Stm, HTX/Kyo, HWY/Slc, IS/Kyo, IS-Tlk/Kyo, KFRS3B/Kyo, LE/Stm, LEC/Tj, NIG-III/Hok, RCS/Kyo, ZF, ZFDM)
Sex: Female and male (see Table 1)
Sequencer or array type: Illumina NextSeq 500
Data format: FASTQ and VCF
Experimental factors: Genomic DNA extracted from spleen
Experimental features: Target capture sequencing of exons and conserved non-coding sequences
Consent: Not applicable
Sample source location: Rat strains were provided by the National BioResource Project (NBRP)–Rat (http://www.anim.med.kyoto-u.ac.jp/nbr/).

References

1.  Genome sequence of the Brown Norway rat yields insights into mammalian evolution.

Authors:  Richard A Gibbs; George M Weinstock; Michael L Metzker; Donna M Muzny; Erica J Sodergren; Steven Scherer; Graham Scott; David Steffen; Kim C Worley; Paula E Burch; Geoffrey Okwuonu; Sandra Hines; Lora Lewis; Christine DeRamo; Oliver Delgado; Shannon Dugan-Rocha; George Miner; Margaret Morgan; Alicia Hawes; Rachel Gill; Robert A Holt; Mark D Adams; Peter G Amanatides; Holly Baden-Tillson; Mary Barnstead; Soo Chin; Cheryl A Evans; Steve Ferriera; Carl Fosler; Anna Glodek; Zhiping Gu; Don Jennings; Cheryl L Kraft; Trixie Nguyen; Cynthia M Pfannkoch; Cynthia Sitter; Granger G Sutton; J Craig Venter; Trevor Woodage; Douglas Smith; Hong-Mei Lee; Erik Gustafson; Patrick Cahill; Arnold Kana; Lynn Doucette-Stamm; Keith Weinstock; Kim Fechtel; Robert B Weiss; Diane M Dunn; Eric D Green; Robert W Blakesley; Gerard G Bouffard; Pieter J De Jong; Kazutoyo Osoegawa; Baoli Zhu; Marco Marra; Jacqueline Schein; Ian Bosdet; Chris Fjell; Steven Jones; Martin Krzywinski; Carrie Mathewson; Asim Siddiqui; Natasja Wye; John McPherson; Shaying Zhao; Claire M Fraser; Jyoti Shetty; Sofiya Shatsman; Keita Geer; Yixin Chen; Sofyia Abramzon; William C Nierman; Paul H Havlak; Rui Chen; K James Durbin; Amy Egan; Yanru Ren; Xing-Zhi Song; Bingshan Li; Yue Liu; Xiang Qin; Simon Cawley; Kim C Worley; A J Cooney; Lisa M D'Souza; Kirt Martin; Jia Qian Wu; Manuel L Gonzalez-Garay; Andrew R Jackson; Kenneth J Kalafus; Michael P McLeod; Aleksandar Milosavljevic; Davinder Virk; Andrei Volkov; David A Wheeler; Zhengdong Zhang; Jeffrey A Bailey; Evan E Eichler; Eray Tuzun; Ewan Birney; Emmanuel Mongin; Abel Ureta-Vidal; Cara Woodwark; Evgeny Zdobnov; Peer Bork; Mikita Suyama; David Torrents; Marina Alexandersson; Barbara J Trask; Janet M Young; Hui Huang; Huajun Wang; Heming Xing; Sue Daniels; Darryl Gietzen; Jeanette Schmidt; Kristian Stevens; Ursula Vitt; Jim Wingrove; Francisco Camara; M Mar Albà; Josep F Abril; Roderic Guigo; Arian Smit; Inna Dubchak; Edward M Rubin; Olivier Couronne; Alexander Poliakov; Norbert Hübner; Detlev Ganten; Claudia Goesele; Oliver Hummel; Thomas Kreitler; Young-Ae Lee; Jan Monti; Herbert Schulz; Heike Zimdahl; Heinz Himmelbauer; Hans Lehrach; Howard J Jacob; Susan Bromberg; Jo Gullings-Handley; Michael I Jensen-Seaman; Anne E Kwitek; Jozef Lazar; Dean Pasko; Peter J Tonellato; Simon Twigger; Chris P Ponting; Jose M Duarte; Stephen Rice; Leo Goodstadt; Scott A Beatson; Richard D Emes; Eitan E Winter; Caleb Webber; Petra Brandt; Gerald Nyakatura; Margaret Adetobi; Francesca Chiaromonte; Laura Elnitski; Pallavi Eswara; Ross C Hardison; Minmei Hou; Diana Kolbe; Kateryna Makova; Webb Miller; Anton Nekrutenko; Cathy Riemer; Scott Schwartz; James Taylor; Shan Yang; Yi Zhang; Klaus Lindpaintner; T Dan Andrews; Mario Caccamo; Michele Clamp; Laura Clarke; Valerie Curwen; Richard Durbin; Eduardo Eyras; Stephen M Searle; Gregory M Cooper; Serafim Batzoglou; Michael Brudno; Arend Sidow; Eric A Stone; J Craig Venter; Bret A Payseur; Guillaume Bourque; Carlos López-Otín; Xose S Puente; Kushal Chakrabarti; Sourav Chatterji; Colin Dewey; Lior Pachter; Nicolas Bray; Von Bing Yap; Anat Caspi; Glenn Tesler; Pavel A Pevzner; David Haussler; Krishna M Roskin; Robert Baertsch; Hiram Clawson; Terrence S Furey; Angie S Hinrichs; Donna Karolchik; William J Kent; Kate R Rosenbloom; Heather Trumbower; Matt Weirauch; David N Cooper; Peter D Stenson; Bin Ma; Michael Brent; Manimozhiyan Arumugam; David Shteynberg; Richard R Copley; Martin S Taylor; Harold Riethman; Uma Mudunuri; Jane Peterson; Mark Guyer; Adam Felsenfeld; Susan Old; Stephen Mockrin; Francis Collins
Journal:  Nature       Date:  2004-04-01

2.  National BioResource Project-Rat and related activities.

Authors:  Tadao Serikawa; Tomoji Mashimo; Akiko Takizawa; Ryoko Okajima; Naoki Maedomari; Kenta Kumafuji; Fumi Tagami; Yuki Neoda; Mito Otsuki; Satoshi Nakanishi; Ken-ichi Yamasaki; Birger Voigt; Takashi Kuramoto
Journal:  Exp Anim       Date:  2009-07

3.  Design and application of a target capture sequencing of exons and conserved non-coding sequences for the rat.

Authors:  Minako Yoshihara; Daisuke Saito; Tetsuya Sato; Osamu Ohara; Takashi Kuramoto; Mikita Suyama
Journal:  BMC Genomics       Date:  2016-08-09

4.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18

5.  The Sequence Alignment/Map format and SAMtools.

Authors:  Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal:  Bioinformatics       Date:  2009-06-08

6.  The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data.

Authors:  Aaron McKenna; Matthew Hanna; Eric Banks; Andrey Sivachenko; Kristian Cibulskis; Andrew Kernytsky; Kiran Garimella; David Altshuler; Stacey Gabriel; Mark Daly; Mark A DePristo
Journal:  Genome Res       Date:  2010-07-19