
SoyBase, the USDA-ARS soybean genetics and genomics database.

David Grant, Rex T Nelson, Steven B Cannon, Randy C Shoemaker.

Abstract

SoyBase, the USDA-ARS soybean genetic database, is a comprehensive repository for professionally curated genetics, genomics and related data resources for soybean. SoyBase contains the most current genetic, physical and genomic sequence maps integrated with qualitative and quantitative traits. The quantitative trait loci (QTL) represent more than 18 years of QTL mapping of more than 90 unique traits. SoyBase also contains the well-annotated 'Williams 82' genomic sequence and associated data mining tools. The genetic and sequence views of the soybean chromosomes and the extensive data on traits and phenotypes are extensively interlinked. This allows entry to the database using almost any kind of available information, such as genetic map symbols, soybean gene names or phenotypic traits. SoyBase is the repository for controlled vocabularies for soybean growth, development and trait terms, which are also linked to the more general plant ontologies. SoyBase can be accessed at http://soybase.org.

Year:  2009        PMID: 20008513      PMCID: PMC2808871          DOI: 10.1093/nar/gkp798

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The last decade has seen a significant increase in soybean [Glycine max (L.) Merr.] research. The first molecular genetic map of only a few hundred RFLP markers has grown to over 4000 loci encompassing RFLP, RAPD, SSR and SNP markers (1). Over a thousand quantitative trait loci (QTL) representing more than 90 agronomically important traits have been mapped in soybean. More than 1.4 million nucleotide and expressed sequence tag (EST) sequences are available in public repositories. Macro- and micro-arrays based on ESTs have been developed and are being used to generate expression data for thousands of genes under different experimental conditions. The recently completed initial assembly of the genomic sequence of the cultivar ‘Williams 82’ is available (Schmutz et al., in preparation). Sequence annotation tracks for gene calls, the BAC-based physical map (2), the Affymetrix SoyChip1 probe sets and numerous gene expression projects are provided in SoyBase. Based on the needs of the soybean research community, the USDA-ARS developed SoyBase as a central repository for genetic and genomic data and related resources for soybean, as well as a single starting point for access to other laboratory-specific web pages and specialized data sets. In this article we present an overview of the major sections of SoyBase and some of the tools available for data mining and searching the database.

RESULTS AND DISCUSSION

General directions for using SoyBase

SoyBase is organized into four broad sections (Figure 1):

– genetic, QTL and physical maps
– genome sequence and annotations
– analysis tools and data mining tools
– data and community resources

Figure 1.

The main navigation aids that are present on all SoyBase pages, and which allow the user to quickly move between the sections of the web site. The main tabs (e.g. Maps or Resources) link immediately to the selected page. Entries on the second line can be used to move directly to a specific SoyBase page.

Genetic and physical maps

The genetic and physical (FPC) maps in SoyBase are displayed using the comparative map viewer CMap, a component of the Generic Model Organism Database Project (http://gmod.org) (3). In addition to providing views of single linkage groups, CMap provides the ability to simultaneously view related maps. Soybean is a recent tetraploid, having undergone polyploidy an estimated 10–15 million years ago (4,5). The ability to view multiple evolutionarily related chromosomes or regions is, therefore, particularly useful in soybean. Figure 2 shows an example of genetic maps from related regions of two homoeologous chromosomes.

Figure 2.

Homoeologous regions of soybean linkage groups C1 and C2 (chromosomes Gm04 and Gm06). Broad QTL classes are indicated by color. Only a subset of the QTL are shown so that potentially related QTL can easily be recognized.

All map features are linked to extensive textual data in SoyBase. Genetic markers have been positioned on both the genetic and sequence maps, thus providing facile movement between these two views of the soybean genome. Contextual menus are used to access the textual data and to move between the genetic and genome sequence views of the chromosomes.
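Because markers carry positions on both the genetic and the sequence maps, they can serve as anchors for translating a coordinate from one view to the other. The following is a minimal sketch of that idea only, not a description of how SoyBase or CMap implement it: it assumes simple linear interpolation between the two flanking anchor markers, and the marker names, positions and the function cm_to_bp are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical anchor markers on one chromosome:
# (name, genetic position in cM, sequence position in bp).
# All names and values are invented for illustration.
ANCHORS = [
    ("marker_A", 0.0, 100_000),
    ("marker_B", 10.0, 2_100_000),
    ("marker_C", 25.0, 3_600_000),
]


def cm_to_bp(cm, anchors=ANCHORS):
    """Estimate a sequence coordinate (bp) for a genetic position (cM).

    Assumes linear interpolation between the two flanking anchors and
    that every anchor has a distinct cM position.
    """
    anchors = sorted(anchors, key=lambda a: a[1])
    cms = [a[1] for a in anchors]
    if not cms[0] <= cm <= cms[-1]:
        raise ValueError("position lies outside the anchored interval")
    # Index of the right-hand flanking anchor, clamped so that a query
    # equal to the last anchor still has a valid interval.
    hi = min(bisect_right(cms, cm), len(anchors) - 1)
    lo = hi - 1
    _, cm_lo, bp_lo = anchors[lo]
    _, cm_hi, bp_hi = anchors[hi]
    frac = (cm - cm_lo) / (cm_hi - cm_lo)
    return round(bp_lo + frac * (bp_hi - bp_lo))
```

In practice the relationship between genetic and physical distance is far from uniform along a chromosome, so an estimate of this kind is only as good as the density of anchor markers in the region of interest.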

Genomic sequence

The ‘Williams 82’ genomic sequence and associated features are accessible using the GBrowse genome viewer (http://gmod.org). The genomic sequence, sequenced by the DOE-JGI and assembled in 2008 by a multi-agency consortium, spans nearly 1 billion bases (Soybean Genome Sequencing Consortium, http://www.phytozome.net/soybean.php; and Schmutz et al., in preparation). Figure 3 shows a region of chromosome Gm01 with some of the annotation tracks that are available. Contextual menus provide additional information for sequence features and links between the sequence and genetic maps.

Figure 3.

Genome sequence view of a region of soybean chromosome Gm01. The tracks shown here are genetic markers, gene models from the 1.01 annotation (Soybean Genome Sequencing Consortium), and locations of Affymetrix probe sets. For each of these tracks, a click on a feature provides a contextual menu with relevant links: to CMap from markers; to several sequence tools from gene models; and to PLEXdb (6) from Affymetrix probe sets. Many other annotation tracks are available, including intragenomic synteny blocks, the BAC-based FPC contigs that comprise the physical map, all of the fingerprinted and end-sequenced BACs, and EST contigs from both soybean and other legumes.

The number of studies on gene expression using either the Affymetrix SoyChip1 or next-generation sequencing strategies is rapidly increasing. To accommodate these data we have partnered with PLEXdb (http://plexdb.org; Wise et al., 2007). In addition, a subset of the short read (i.e. 454 or Solexa) expression data, as well as the Affymetrix probe sets, are presented in the SoyBase genome sequence viewer. Links are provided to PLEXdb to allow the user to further explore the experimental design and data. Links are also available from PLEXdb to SoyBase to allow the experimental results to be analyzed in the context of the genetic and genomic data in SoyBase.

Tools

SoyBase contains a number of analysis tools, including:

– Sequence similarity searches of Glycine max and G. soja sequences: BLAST and BLAT can be used to search all or a subset of the Glycine sequences (ESTs, gene indices, etc.), including the genome sequence. The results of a BLAT search against the whole genome sequence are shown directly in the genome sequence browser.
– Search Glycine max and G. soja EST sequence libraries by keywords: Text-based searches of the library and individual EST annotations can be performed. Results are returned in a format suitable for viewing in a web browser or for import into other analysis programs or databases.
– Search the annotations for Affymetrix SoyChip contig sequences: A comprehensive annotation has been developed for each of the probe sets on the Affymetrix SoyChip. A user can provide a file of probe set names of interest and get a report containing their annotations.
– Search for unigene sequences that match Affymetrix SoyChip probe sequences: Several unigene sets have been constructed for soybean ESTs using different assembly parameters. This tool allows a user to get the unigene(s) associated with one or more probe sets from each of the unigene assemblies.
– Browse or search soybean ontologies: SoyBase contains the most complete trait, growth and developmental ontologies available. These ontologies were developed by SoyBase staff to suggest a controlled vocabulary for soybean field growth stages (SoyWGR), individual plant development (SoyGRO) and traits (SoyTO). Where applicable, soybean-specific terms have been associated with their Plant Ontology (PO) and Gramene Plant Trait Ontology (TO) synonyms to facilitate cross-species comparisons.
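The probe set annotation search described above amounts to a batch lookup: a list of probe set names goes in and a report comes out that can be imported into other programs. As a rough illustration of that workflow, and not of the SoyBase tool itself, the sketch below builds a tab-delimited report from an in-memory table. The probe set identifiers, the annotation text, the column names and the function annotation_report are all hypothetical placeholders.

```python
import csv
import io

# Hypothetical annotation table keyed by probe set name. The identifiers
# and descriptions are invented placeholders, not real SoyChip annotations.
ANNOTATIONS = {
    "probe_0001_at": "putative kinase",
    "probe_0002_at": "hypothetical protein",
}


def annotation_report(probe_names, annotations=ANNOTATIONS):
    """Return a tab-delimited report with one row per requested probe set.

    probe_names is any iterable of strings, for example the lines of an
    uploaded text file. Names missing from the table are reported as
    "not found" rather than silently dropped.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["probe_set", "annotation"])
    for name in probe_names:
        name = name.strip()
        if not name:
            continue  # skip blank lines in the input
        writer.writerow([name, annotations.get(name, "not found")])
    return out.getvalue()
```

Keeping unmatched names in the output makes it easy to spot typing errors or outdated identifiers in the submitted list.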

Resources

SoyBase is intended to be a central resource for soybean researchers. In addition to the data, maps and tools described above, a number of community resources are made available either as data or as links to other sites. These include, among others, links to other soybean-centric websites and laboratories, a list of upcoming meetings of potential interest to soybean researchers and links to an extensive collection of USDA sites about soybean breeding and production.

Data availability

All of the data in SoyBase is freely available. Please use the ‘Contact Us’ page at SoyBase or email the curator (David Grant, david.grant@ars.usda.gov) to request any specific subset of the data.

FUTURE PLANS

SoyBase is continuously updated to include new data as they become available. In addition, new data types are incorporated and linked to the existing data when appropriate. Some new data types that will soon be added to SoyBase include:

– allele data and frequencies for the genetic markers.
– genetic marker-based haplotypes for the soybean germplasm collection.
– integrated maps and data for the identified retrotransposon and DNA transposable elements in soybean.

FUNDING

United States Department of Agriculture, Agricultural Research Service (USDA-ARS). Funding for open access charge: USDA-ARS.

Conflict of interest statement. None declared.

REFERENCES

1.  The generic genome browser: a building block for a model organism system database.

Authors:  Lincoln D Stein; Christopher Mungall; ShengQiang Shu; Michael Caudy; Marco Mangone; Allen Day; Elizabeth Nickerson; Jason E Stajich; Todd W Harris; Adrian Arva; Suzanna Lewis
Journal:  Genome Res       Date:  2002-10       Impact factor: 9.043

2.  A soybean transcript map: gene distribution, haplotype and single-nucleotide polymorphism analysis.

Authors:  Ik-Young Choi; David L Hyten; Lakshmi K Matukumalli; Qijian Song; Julian M Chaky; Charles V Quigley; Kevin Chase; K Gordon Lark; Robert S Reiter; Mun-Sup Yoon; Eun-Young Hwang; Seung-In Yi; Nevin D Young; Randy C Shoemaker; Curtis P van Tassell; James E Specht; Perry B Cregan
Journal:  Genetics       Date:  2007-03-04       Impact factor: 4.562

3.  Placing paleopolyploidy in relation to taxon divergence: a phylogenetic analysis in legumes using 39 gene families.

Authors:  B E Pfeil; J A Schlueter; R C Shoemaker; J J Doyle
Journal:  Syst Biol       Date:  2005-06       Impact factor: 15.683

4.  Mining EST databases to resolve evolutionary events in major crop species.

Authors:  Jessica A Schlueter; Phillip Dixon; Cheryl Granger; David Grant; Lynn Clark; Jeff J Doyle; Randy C Shoemaker
Journal:  Genome       Date:  2004-10       Impact factor: 2.166

5.  Microsatellite discovery from BAC end sequences and genetic mapping to anchor the soybean physical and genetic maps.

Authors:  Randy C Shoemaker; David Grant; Terry Olson; Wesley C Warren; Rod Wing; Yeisoo Yu; HyeRan Kim; Perry Cregan; Bindu Joseph; Montona Futrell-Griggs; Will Nelson; Jon Davito; Jason Walker; John Wallis; Colin Kremitski; Debbie Scheer; Sandra W Clifton; Tina Graves; Henry Nguyen; Xiaolei Wu; Mingcheng Luo; Jan Dvorak; Rex Nelson; Steven Cannon; Jeff Tomkins; Jeremy Schmutz; Gary Stacey; Scott Jackson
Journal:  Genome       Date:  2008-04       Impact factor: 2.166
