
DaizuBase, an integrated soybean genome database including BAC-based physical maps.

Yuichi Katayose, Hiroyuki Kanamori, Michihiko Shimomura, Hajime Ohyanagi, Hiroshi Ikawa, Hiroshi Minami, Michie Shibata, Tomoko Ito, Kanako Kurita, Kazue Ito, Yasutaka Tsubokura, Akito Kaga, Jianzhong Wu, Takashi Matsumoto, Kyuya Harada, Takuji Sasaki.

Abstract

Soybean [Glycine max (L.) Merrill] is one of the most important leguminous crops and ranks fourth after rice, wheat and maize in terms of world crop production. Soybean contains abundant protein and oil, which makes it a major source of nutritious food, livestock feed and industrial products. In Japan, soybean is also an important source of traditional staples such as tofu, natto, miso and soy sauce. The soybean genome sequence was determined in 2010. Given the enormous size of the genome, physical mapping and genome sequencing are the most effective approaches towards understanding its structure and function. We constructed bacterial artificial chromosome (BAC) libraries from the Japanese soybean cultivar Enrei. The end sequences of approximately 100,000 BAC clones were analyzed and used to construct a BAC-based physical map of the genome. BLAST analysis between the Enrei BAC-end sequences and the Williams82 genome was carried out to increase the saturation of the map. This physical map will be used to characterize the genome structure of Japanese soybean cultivars, to develop methods for the isolation of agronomically important genes and to facilitate comparative soybean genome research. The current status of physical mapping of the soybean genome and the construction of the database are presented.


Keywords: BAC-end sequencing; database; physical map

Year: 2012 PMID: 23136506 PMCID: PMC3406781 DOI: 10.1270/jsbbs.61.661

Source DB: PubMed Journal: Breed Sci ISSN: 1344-7610 Impact factor: 2.086

Introduction

In 2010, the soybean genome was sequenced and assembled by the Soybean Genome Sequencing Consortium in the USA (Schmutz et al.). The genome data are available through two databases, Phytozome (http://www.phytozome.net/soybean) and SoyBase (Grant et al.) (http://soybase.org/). Other soybean genomes were also sequenced using next-generation sequencers (Kim et al., Lam et al.). SoyBase is an essential site and tool for soybean researchers investigating genetics, molecular biology, breeding and genomics. Although this database is important for soybean research, the Williams82 genome data alone are insufficient for research on Japanese soybean. We therefore selected Enrei, a common cultivar in Japan, to construct a physical map, decode the genome sequence and build a genome database.

BAC library construction

BAC libraries were constructed from nuclear DNA prepared from young leaves of Enrei (Baba ). Two restriction endonucleases, HindIII and MboI, were used for partial digestion of the DNA. Partially digested and size-selected DNA (100–180 kb) was ligated into the BAC vector pIndigoBAC5 (Epicentre Biotechnologies) and then transformed into E. coli ElectroMAX DH10B cells (Life Technologies). We picked 80,000 clones from the HindIII digest and 100,000 clones from the MboI digest, and designated the HindIII library GMJENa and the MboI library GMJENb. The insert sizes were 140 and 100 kb for the GMJENa and GMJENb libraries, respectively. The clones were stored in 384-well microplates at −80°C.
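As a rough illustration of what these library sizes imply, the sketch below computes the total insert length of each library in genome equivalents (clones × insert size / genome size). The genome size used here is an assumption: it is the summed length of the 20 chromosomes reported in Table 1, not a value given in this section, and the quoted insert sizes are treated as representative for every clone.

```python
# Rough genome-equivalent coverage of a BAC library:
# coverage = (number of clones x insert size) / genome size.

# Assumption: total length of the 20 chromosomes listed in Table 1.
GENOME_SIZE_BP = 950_068_807


def library_coverage(n_clones: int, insert_kb: float,
                     genome_bp: int = GENOME_SIZE_BP) -> float:
    """Return the library's total insert length in genome equivalents."""
    return n_clones * insert_kb * 1_000 / genome_bp


gmjena = library_coverage(80_000, 140)   # HindIII library (GMJENa)
gmjenb = library_coverage(100_000, 100)  # MboI library (GMJENb)

print(f"GMJENa: {gmjena:.1f}x, GMJENb: {gmjenb:.1f}x, "
      f"combined: {gmjena + gmjenb:.1f}x")
```

Under these assumptions each library alone holds roughly ten or more genome equivalents of insert DNA, which is why a subset of end-sequenced clones can still cover most of the reference.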

End sequencing of BAC clones

Both ends of all GMJENa clones and of 20,000 GMJENb clones were sequenced using the BigDye Terminator method (Life Technologies) on an ABI 3730xl capillary sequencer (Life Technologies) (Katagiri ). The sequence data obtained were analyzed with the Phred/Phrap software (Ewing and Green 1998, Ewing et al. 1998). After exclusion of low-quality bases (Phred score <30), the average read length of the BAC-end sequences was 650 bases.
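The Phred threshold can be made concrete with a short sketch. A Phred score Q corresponds to an estimated base-calling error probability P through Q = −10·log10(P), so Q30 means about one error per 1,000 bases. The text does not say how low-quality bases were excluded; keeping the longest contiguous run of bases with Q ≥ 30 is only one plausible strategy, and the function names and the toy quality values below are hypothetical.

```python
def phred_to_error_prob(q: float) -> float:
    """Phred definition: Q = -10 * log10(P), hence P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def longest_high_quality_run(quals: list[int], min_q: int = 30) -> tuple[int, int]:
    """Return (start, end) of the longest contiguous run with quality >= min_q.

    `end` is exclusive; (0, 0) is returned when no base passes the threshold.
    Assumption: trimming to one contiguous window, which the paper does not state.
    """
    best = (0, 0)
    run_start = None
    # A failing sentinel value closes a run that reaches the end of the read.
    for i, q in enumerate(quals + [min_q - 1]):
        if q >= min_q:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best[1] - best[0]:
                best = (run_start, i)
            run_start = None
    return best


# Toy per-base quality values (made up for illustration).
quals = [12, 18, 31, 35, 40, 42, 38, 29, 33, 36, 15]
start, end = longest_high_quality_run(quals)
q30_error = phred_to_error_prob(30)

print(q30_error)                # error probability at Q30
print(start, end, end - start)  # retained window and its length
```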

Mapping of BAC clones and construction of physical map

To identify the physical position of each sequenced clone, the end sequences were aligned by BLASTN against the Williams82 genome assembly (Glyma1.09), and the end-sequenced BAC clones were then placed on the chromosomes of that assembly. In total, 59,361 BAC clones were mapped on the Williams82 genome (58,997 on the 20 chromosomes and 364 on other scaffolds), and 91% of the genome was covered by Enrei BAC clones (Table 1). We detected differences between the Enrei BAC-end sequences and the Williams82 genome assembly: for each chromosome, the mismatch rate was 0.2–0.5% and the deletion rate was less than 0.1%.
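The placement step can be sketched as follows. The text does not give the criteria used to accept a clone, so everything here is an assumption for illustration: the `EndHit` record, the requirement that both ends hit the same chromosome on opposite strands and point inward, and the span limits of 20 to 300 kb are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EndHit:
    """Best BLASTN hit of one BAC-end read on the reference (hypothetical record)."""
    chrom: str
    start: int   # leftmost reference coordinate of the alignment (1-based)
    end: int     # rightmost reference coordinate of the alignment
    strand: str  # "+" or "-"


def place_bac(fwd: EndHit, rev: EndHit,
              min_span: int = 20_000,
              max_span: int = 300_000) -> Optional[tuple[str, int, int]]:
    """Return (chrom, start, end) if the two end hits form a plausible pair.

    Assumed rules: same chromosome, opposite strands, ends pointing inward,
    and an implied insert span within [min_span, max_span].
    """
    if fwd.chrom != rev.chrom or fwd.strand == rev.strand:
        return None
    left, right = sorted((fwd, rev), key=lambda h: h.start)
    if left.strand != "+" or right.strand != "-":
        return None  # the two ends point away from each other
    span = right.end - left.start + 1
    if not (min_span <= span <= max_span):
        return None
    return (left.chrom, left.start, right.end)


# Toy hits: a consistent pair spanning 140 kb, and a pair split across chromosomes.
placed = place_bac(EndHit("Gm01", 1_000_001, 1_000_650, "+"),
                   EndHit("Gm01", 1_139_351, 1_140_000, "-"))
rejected = place_bac(EndHit("Gm01", 1_000_001, 1_000_650, "+"),
                     EndHit("Gm02", 1_139_351, 1_140_000, "-"))
print(placed, rejected)
```

Clones rejected by such rules are not necessarily errors; in a cross-cultivar comparison they may also mark structural differences between Enrei and Williams82.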

Table 1

Statistics of the “Enrei” BAC-based physical map based on 20 chromosomes

Chromosome	BAC number	BAC contig	Single BAC contig	Total length (bp)	Covered length (bp)	Total gap length (bp)	Cover rate (%)
Gm01 (D1a)	4,110	44	6	55,915,595	53,637,206	2,278,389	96
Gm02 (D1b)	3,179	72	10	51,656,713	46,459,754	5,196,959	90
Gm03 (N)	2,462	62	7	47,781,076	43,124,153	4,656,923	90
Gm04 (C1)	2,882	56	6	49,243,852	45,507,841	3,736,011	92
Gm05 (A1)	2,852	39	4	41,936,504	38,979,257	2,957,247	93
Gm06 (C2)	2,760	64	8	50,722,821	45,066,672	5,656,149	89
Gm07 (M)	2,695	48	7	44,683,157	41,367,378	3,315,779	93
Gm08 (A2)	2,763	56	7	46,995,532	43,208,178	3,787,354	92
Gm09 (K)	3,101	40	3	46,843,750	44,090,053	2,753,697	94
Gm10 (O)	3,077	60	6	50,969,635	45,376,931	5,592,704	89
Gm11 (B1)	2,447	49	4	39,172,790	35,810,276	3,362,514	91
Gm12 (H)	2,430	41	5	40,113,140	35,646,507	4,466,633	89
Gm13 (F)	1,992	70	12	44,408,971	36,659,143	7,749,828	83
Gm14 (B2)	3,774	45	6	49,711,204	45,751,866	3,959,338	92
Gm15 (E)	3,117	49	3	50,939,160	47,368,637	3,570,523	93
Gm16 (J)	2,392	49	11	37,397,385	33,708,594	3,688,791	90
Gm17 (D2)	2,363	55	11	41,906,774	37,992,668	3,914,106	91
Gm18 (G)	3,714	63	3	62,308,140	57,128,821	5,179,319	92
Gm19 (L)	2,994	51	5	50,589,441	46,460,750	4,128,691	92
Gm20 (I)	3,893	45	4	46,773,167	43,610,445	3,162,722	93

Total	58,997	1,058	128	950,068,807	866,955,130	83,113,677	91

BAC clones mapped on other scaffolds are not shown.

BAC number: number of BAC clones mapped on each chromosome.

BAC contig: number of contigs on each chromosome.

Single BAC contig: number of contigs consisting of a single BAC clone.

Total length: length of each chromosome in base pairs.

Covered length: total size of the regions covered by BAC clones.

Total gap length: total size of the regions not covered by any BAC clone.

Cover rate: (covered length)/(total length) × 100 (%).
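The columns defined above can be derived from a list of placed BAC intervals by merging overlapping clones into contigs. The sketch below is a minimal illustration under stated assumptions: coordinates are 1-based and inclusive, two BACs join a contig only if they overlap, and single-BAC contigs are counted within the BAC contig total. None of these conventions is confirmed by the source, and the function names are hypothetical.

```python
def build_contigs(bacs: list[tuple[int, int]]) -> list[tuple[int, int, int]]:
    """Merge overlapping BAC intervals into contigs.

    Returns (start, end, n_bacs) tuples sorted by start coordinate.
    """
    contigs: list[list[int]] = []
    for start, end in sorted(bacs):
        if contigs and start <= contigs[-1][1]:  # overlaps the current contig
            contigs[-1][1] = max(contigs[-1][1], end)
            contigs[-1][2] += 1
        else:                                    # gap: open a new contig
            contigs.append([start, end, 1])
    return [(c[0], c[1], c[2]) for c in contigs]


def chromosome_stats(bacs: list[tuple[int, int]], chrom_length: int) -> dict:
    """Compute the Table 1 columns for one chromosome."""
    contigs = build_contigs(bacs)
    covered = sum(end - start + 1 for start, end, _ in contigs)
    return {
        "bac_number": len(bacs),
        "bac_contigs": len(contigs),
        "single_bac_contigs": sum(1 for _, _, n in contigs if n == 1),
        "covered_length": covered,
        "total_gap_length": chrom_length - covered,
        "cover_rate": round(100 * covered / chrom_length),
    }


# Toy chromosome of 1,000 bp with three BACs (made-up coordinates).
stats = chromosome_stats([(1, 200), (150, 400), (600, 700)], chrom_length=1_000)
print(stats)

# Cover rate recomputed from the Gm01 row of Table 1.
gm01_cover_rate = round(100 * 53_637_206 / 55_915_595)
print(gm01_cover_rate)
```

Recomputing the rate from the Gm01 row gives the tabulated 96%, and the covered and gap lengths in each row sum to the total length.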

DaizuBase

We constructed an integrated soybean genome database, DaizuBase (http://daizu.dna.affrc.go.jp). The database consists of Gbrowse, Unified Map and BLAST search pages. The Gbrowse page shows the BAC-based physical map, and the Unified Map page shows the linkage map and DNA markers; both are based on the Williams82 genome assembly. Gbrowse provides tracks for the DNA sequence, BAC ends, BAC contigs, GC content, ESTs, full-length cDNAs (Umezawa et al.) and DNA markers (Fig. 1). DaizuBase also provides sequence, keyword and position search systems.

Fig. 1

Browsing DaizuBase. A) DaizuBase top page with links to Gbrowse, Unified Map and Blast search. B) Gbrowse shows BAC-based physical map data. C) Unified Map shows relationships among the linkage map, DNA markers and BAC end sequences. D) Sequence search systems using BLAST.

The prospects

Using the Roche/454 next-generation sequencer GS-FLX Titanium (Margulies et al.), sequence data equivalent to 10 times the genome size of the Japanese soybean cultivar Enrei have already been obtained. After analyzing the data, we will upload the Enrei genome data to DaizuBase. The database will provide SNP and InDel data between the Enrei and Williams82 genomes. The Enrei genome data will be useful for distinguishing domestic soybean genomes and isolating important genes. Furthermore, sequencing of the genomes of various Japanese cultivars is in progress using next-generation sequencers. These genomic data will be useful for establishing DNA markers for Japanese cultivars.

References

1. Resequencing of 31 wild and cultivated soybean genomes identifies patterns of genetic diversity and selection.

Authors: Hon-Ming Lam; Xun Xu; Xin Liu; Wenbin Chen; Guohua Yang; Fuk-Ling Wong; Man-Wah Li; Weiming He; Nan Qin; Bo Wang; Jun Li; Min Jian; Jian Wang; Guihua Shao; Jun Wang; Samuel Sai-Ming Sun; Gengyun Zhang
Journal: Nat Genet Date: 2010-11-14 Impact factor: 38.330

2. Genome sequencing in microfabricated high-density picolitre reactors.

Authors: Marcel Margulies; Michael Egholm; William E Altman; Said Attiya; Joel S Bader; Lisa A Bemben; Jan Berka; Michael S Braverman; Yi-Ju Chen; Zhoutao Chen; Scott B Dewell; Lei Du; Joseph M Fierro; Xavier V Gomes; Brian C Godwin; Wen He; Scott Helgesen; Chun Heen Ho; Chun He Ho; Gerard P Irzyk; Szilveszter C Jando; Maria L I Alenquer; Thomas P Jarvie; Kshama B Jirage; Jong-Bum Kim; James R Knight; Janna R Lanza; John H Leamon; Steven M Lefkowitz; Ming Lei; Jing Li; Kenton L Lohman; Hong Lu; Vinod B Makhijani; Keith E McDade; Michael P McKenna; Eugene W Myers; Elizabeth Nickerson; John R Nobile; Ramona Plant; Bernard P Puc; Michael T Ronan; George T Roth; Gary J Sarkis; Jan Fredrik Simons; John W Simpson; Maithreyan Srinivasan; Karrie R Tartaro; Alexander Tomasz; Kari A Vogt; Greg A Volkmer; Shally H Wang; Yong Wang; Michael P Weiner; Pengguang Yu; Richard F Begley; Jonathan M Rothberg
Journal: Nature Date: 2005-07-31 Impact factor: 49.962

3. Base-calling of automated sequencer traces using phred. I. Accuracy assessment.

Authors: B Ewing; L Hillier; M C Wendl; P Green
Journal: Genome Res Date: 1998-03 Impact factor: 9.043

4. Base-calling of automated sequencer traces using phred. II. Error probabilities.

Authors: B Ewing; P Green
Journal: Genome Res Date: 1998-03 Impact factor: 9.043

5. Whole-genome sequencing and intensive analysis of the undomesticated soybean (Glycine soja Sieb. and Zucc.) genome.

Authors: Moon Young Kim; Sunghoon Lee; Kyujung Van; Tae-Hyung Kim; Soon-Chun Jeong; Ik-Young Choi; Dae-Soo Kim; Yong-Seok Lee; Daeui Park; Jianxin Ma; Woo-Yeon Kim; Byoung-Chul Kim; Sungjin Park; Kyung-A Lee; Dong Hyun Kim; Kil Hyun Kim; Jin Hee Shin; Young Eun Jang; Kyung Do Kim; Wei Xian Liu; Tanapon Chaisan; Yang Jae Kang; Yeong-Ho Lee; Kook-Hyung Kim; Jung-Kyung Moon; Jeremy Schmutz; Scott A Jackson; Jong Bhak; Suk-Ha Lee
Journal: Proc Natl Acad Sci U S A Date: 2010-12-03 Impact factor: 11.205

6. Genome sequence of the palaeopolyploid soybean.

Authors: Jeremy Schmutz; Steven B Cannon; Jessica Schlueter; Jianxin Ma; Therese Mitros; William Nelson; David L Hyten; Qijian Song; Jay J Thelen; Jianlin Cheng; Dong Xu; Uffe Hellsten; Gregory D May; Yeisoo Yu; Tetsuya Sakurai; Taishi Umezawa; Madan K Bhattacharyya; Devinder Sandhu; Babu Valliyodan; Erika Lindquist; Myron Peto; David Grant; Shengqiang Shu; David Goodstein; Kerrie Barry; Montona Futrell-Griggs; Brian Abernathy; Jianchang Du; Zhixi Tian; Liucun Zhu; Navdeep Gill; Trupti Joshi; Marc Libault; Anand Sethuraman; Xue-Cheng Zhang; Kazuo Shinozaki; Henry T Nguyen; Rod A Wing; Perry Cregan; James Specht; Jane Grimwood; Dan Rokhsar; Gary Stacey; Randy C Shoemaker; Scott A Jackson
Journal: Nature Date: 2010-01-14 Impact factor: 49.962

7. SoyBase, the USDA-ARS soybean genetics and genomics database.

Authors: David Grant; Rex T Nelson; Steven B Cannon; Randy C Shoemaker
Journal: Nucleic Acids Res Date: 2009-12-14 Impact factor: 16.971

8. Sequencing and analysis of approximately 40,000 soybean cDNA clones from a full-length-enriched cDNA library.

Authors: Taishi Umezawa; Tetsuya Sakurai; Yasushi Totoki; Atsushi Toyoda; Motoaki Seki; Atsushi Ishiwata; Kenji Akiyama; Atsushi Kurotani; Takuhiro Yoshida; Keiichi Mochida; Mie Kasuga; Daisuke Todaka; Kyonoshin Maruyama; Kazuo Nakashima; Akiko Enju; Saho Mizukado; Selina Ahmed; Kyoko Yoshiwara; Kyuya Harada; Yasutaka Tsubokura; Masaki Hayashi; Shusei Sato; Toyoaki Anai; Masao Ishimoto; Hideyuki Funatsuki; Masayoshi Teraishi; Mitsuru Osaki; Takuro Shinano; Ryo Akashi; Yoshiyuki Sakaki; Kazuko Yamaguchi-Shinozaki; Kazuo Shinozaki
Journal: DNA Res Date: 2008-10-16 Impact factor: 4.458
