
MaizeGDB's new data types, resources and activities.

Carolyn J Lawrence, Mary L Schaeffer, Trent E Seigfried, Darwin A Campbell, Lisa C Harper.

Abstract

MaizeGDB is the Maize Genetics and Genomics Database. Available at MaizeGDB are diverse data that support maize research including maps, gene product information, loci and their various alleles, phenotypes (both naturally occurring and as a result of directed mutagenesis), stocks, sequences, molecular markers, references and contact information for maize researchers worldwide. Also available through MaizeGDB are various community support service bulletin boards including the Editorial Board's list of high-impact papers, information about the Annual Maize Genetics Conference and the Jobs board where employment opportunities are posted. Reported here are data updates, improvements to interfaces and changes to standard operating procedures that have been made during the past 2 years. MaizeGDB is freely available and can be accessed online at http://www.maizegdb.org.

Year: 2007 PMID: 17202174 PMCID: PMC1899092 DOI: 10.1093/nar/gkl1048

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Maize (Zea mays ssp. mays) has long been the number one production crop in the United States, and in 2001 it also became number one in the world. As maize is economically important and also serves as a model organism for genetics research, it is one of the most highly researched organisms in existence. The maize genome has an especially high level of DNA sequence polymorphism and extended regions of non-homology between inbred lines (1,2), hence the diversity represented by the maize gene pool is unparalleled in both a phenotypic and molecular sense. This provides a unique vehicle to explore questions in evolution, domestication, development, trait expression, functional allelic diversity and the interrelated processes that shape such events and their outcomes. The application of new technologies and bioinformatic tools coupled with thorough phenotypic evaluation for useful traits and molecular characterization of diverse maize germplasm offers the potential for significant discovery via translational genomics. The goal of turning the identification and evaluation of functional and evolutionarily important allelic variation into a comprehensive genomics activity is dependent on being able to associate diverse information in a seamless manner. Maize Genetics and Genomics Database (MaizeGDB) is the community resource charged with developing informatic solutions for storing, displaying and linking comprehensive maize data, so that they are made easily accessible to researchers worldwide. Described here are updates and improvements to MaizeGDB that have been made over the course of the past 2 years. Throughout this paper direct links are listed parenthetically to enable direct access to relevant data. Those data also could be accessed via search and browse mechanisms available on the MaizeGDB front page. 
For example, to find a map that is listed parenthetically, one could search either ‘all records’ or ‘maps’ using the search bar at the top (and bottom) of any MaizeGDB page (Figure 1a) using the map name. That same data also could be accessed from the front page by navigating to the Maps Data Center using the link shown in Figure 1b. On the Maps Data Center Search page, the Simple or Advanced Search mechanisms could be used to search maps by name, type, or using other limiters. Other data types also can be accessed from the front page using similar tactics.

Figure 1

Important information, data centers, tools and news items are accessible from the MaizeGDB home page. Data and bulletin boards as well as links to noteworthy or high-profile projects are accessible directly on the front page. All MaizeGDB pages share the same header, which provides, e.g. searches of all data from any page (a) and access to tools including the Community Curation Tools. The ‘Maps’ Data Center (b) can limit results by contained loci, chromosome, source or mapping panel and also allows direct access to unique map types including the Recombination Nodule maps through its ‘Map Reports and Tools’ section. Likewise, the ‘Stocks’ Data Center (c) enables queries by focus linkage group, genotypic variation, karyotypic variation and other limiters. Bulletin boards that keep researchers connected with the community include the Editorial Board (d), the Maize Genetics Executive Committee pages (e) and the site for the Annual Maize Genetics Conference, which is referred to simply as ‘The Maize Meeting’ (f). Dates of database updates are available directly on the front page (g), as are major efforts of interest to all maize researchers, such as the Maize Genome Sequencing project.

NEW DATA TYPES

Since the initial announcement in 2004 that MaizeGDB was up and running (3), the database has expanded to include new data types, such as TILLING data [(4); ], cytological maps with associated images (e.g. ), and the maize Recombination Nodule maps [(5,6); ]. All new data types are made available alongside related information and are accessible through mechanisms that seamlessly integrate with the site's existing functionalities. Recombination Nodule maps are uncommon: at present, they are only available for maize, mouse, and tomato (5,7,8). As many researchers are unfamiliar with this unique data type and to demonstrate the utility of the newly available maize Recombination Nodule maps, an example of how they can be used to speed up experiments that utilize the maize translocation stocks follows. A researcher wants to determine whether the gene product of her new mutant gof1, which maps to 3L within 1 cM of tub6, acts cell autonomously using a mosaic analysis. Since there are no suitable cell autonomous markers known to be proximal to gof1, she decides to use an A-A translocation to bring gof1+ distal to a cell autonomous marker on another chromosome. To do this, she needs to find a translocation stock with a breakpoint on 3L proximal to her gene of interest, and a breakpoint on another chromosome that is distal to a gene that can be used as a cell autonomous marker. To find a list of available translocations, she uses the link on the front page of MaizeGDB to get to the Stocks Data Center (Figure 1c), scrolls down the page (past the green ‘Advanced Stock Query’ box) to Maize Genetics Cooperation Stock Center Resources (9), and clicks the link for the ‘Stock Center Catalog.’ She clicks the link for ‘Reciprocal Translocation (comprehensive list)’ to arrive at , and decides to try 1049B T1-3(5242) (3L.65; 1L.90). 
In order to determine the approximate positions of these breakpoints relative to genetically mapped loci, she goes to the Maps Data Center (Figure 1b) and navigates to the Recombination Nodule maps. From here, she clicks the link toward the bottom of the page to go to the Morgan2McClintock Translator [(10); ]. Here, she chooses chromosome 3, pastes the entire ‘Genetic 2005 3’ map from MaizeGDB (accessible at ) into the text box and hits the button marked ‘Calculate!’. The output file shows that 3L.65 lies about 5 cM proximal to tub6. This means that the breakpoint is proximal to her gene of interest! Using the Translator again for chromosome 1, she finds the position of 1L.90 relative to the ‘Genetic 2005 1’ map (accessible at ). The output table shows that lw1, a suitable cell autonomous marker (11), lies between cent1 and the breakpoint. Using the stock for T1-3(5242), she can set up a stock heterozygous for gof1, lw1, and the translocation where gof1+ and lw1+ reside on the translocation, and the recessive mutant alleles reside on the normal (non-translocation) chromosomes. Without access to the maize Recombination Nodule map data and the Morgan2McClintock Translator, cytological and genetic maps cannot be integrated directly. Setting up such an experiment would have required many additional tests using various translocation stocks.
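The kind of conversion used in this example, from a genetic position in cM to a cytological position along a chromosome arm, can be pictured as interpolation along a cumulative recombination nodule (RN) distribution. The following Python sketch is an illustration under stated assumptions, not the Morgan2McClintock Translator's actual code: the function name `cm_to_fraction`, the equal-length intervals, the RN counts and the convention of expressing position as a fraction of arm length measured from the centromere are all hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def cm_to_fraction(rn_counts, locus_cm, arm_cm):
    """Convert a genetic distance from the centromere (cM) into a
    fractional position along the arm (0.0 = centromere, 1.0 = arm end).

    rn_counts -- hypothetical RN observations per equal-length interval,
                 ordered from the centromere toward the arm end
    locus_cm  -- genetic distance of the locus from the centromere
    arm_cm    -- total genetic length of the arm
    """
    total = sum(rn_counts)
    if total <= 0 or arm_cm <= 0:
        raise ValueError("need at least one RN and a positive arm length")

    # Cumulative genetic distance reached at the end of each interval,
    # assuming cM accumulate in proportion to RN frequency.
    cumulative = [arm_cm * c / total for c in accumulate(rn_counts)]

    # Clamp the query to the arm, then find the interval that contains it.
    target = min(max(locus_cm, 0.0), arm_cm)
    i = min(bisect_left(cumulative, target), len(cumulative) - 1)

    # Interpolate linearly inside that interval.
    start = cumulative[i - 1] if i > 0 else 0.0
    width = cumulative[i] - start
    within = (target - start) / width if width > 0 else 0.0
    return (i + within) / len(rn_counts)


# Made-up counts for five intervals on a 50 cM arm.
position = cm_to_fraction([1, 1, 2, 4, 2], locus_cm=30.0, arm_cm=50.0)
print(f"fractional arm position: {position:.2f}")
```

With real data the interval counts would come from the published Recombination Nodule maps. Because recombination is unevenly distributed along the arm, equal genetic distances do not correspond to equal cytological distances, which is why a breakpoint such as 3L.65 cannot be placed on a genetic map by simple proportion.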

DATA UPDATES

All types of data (references, loci, maps, etc.) are updated regularly as time and human resources allow. Listed here are a few of the major recent updates to content. MaizeGDB's sequence set is made up of all public maize sequences including expressed sequence tag (EST), cDNA, genome survey sequence (GSS), sequence tagged site (STS), high-throughput cDNA (HTC) and genomic DNA sequences from GenBank (12) as well as the UniProt (13) protein sequence set. This dataset is updated monthly utilizing a custom pipeline set up by workers at PlantGDB (14). Whereas in the past only the Z. mays ssp. mays sequence set was included in the update, the sequence set update pipeline was recently changed to include sequences for all subspecies of Z. mays. Also stored at MaizeGDB is contig membership information for the PlantGDB GSS and PUT (putative unique transcript) contigs as well as the TIGR TC (tentative consensus) EST contig set [12 October 2005 release; (15)]. The IBM2 FPC0507 maps () represent the anchored BAC fingerprint contigs which are currently being used to guide the B73 Maize Genome Sequencing Project. These maps were derived using anchor information for 414 contigs as assigned in the July 2005 manually edited FPC product [(16); ]. Represented on the map are loci where at least 2 BACs in the contig were empirically associated with a molecular marker. Positions for loci were derived using a hybrid coordinate composed of an integer representing the nearest genetic anchor point followed by a decimal and the FPC consensus band (CB) coordinate. Loci are associated with the defining BACs, markers, and marker sequences, and are linked to the current contig representation at Arizona. This map adds over 25 000 new loci to MaizeGDB, most of which are associated with overgo probes designed to detect full-length cDNAs (17). 
This map serves to integrate the genetic and physical maps and improves high resolution mapping capabilities for researchers working to localize a trait or phenotype of interest to a small region of the genome, thus facilitating, e.g. candidate gene discovery by chromosome walking. Initiated to assist in anchoring BAC contigs and continued to support maize genetics research, the IBM Neighbors product approximates the genetic map orders of all loci mapped to better than 5 cM and most recently includes the loci which have only been placed onto anchored FPC contigs. Like the anchored BAC FPC contigs, the IBM Neighbors representation utilizes the IBM2 map as the framework. Each version is maintained, and the primary map source for a locus coordinate is displayed. The most recent version () includes well-ordered mutants based on the Genetic 2005 compilation (18), which incorporated the UMC 98 maps (19). Over 5300 newly mapped loci, based on sequenced probes where most were derived from a cDNA, were provided to MaizeGDB from mapping projects that are using high-resolution intermated recombinant inbred panels of stocks. These came from: the Schnable lab (3391 insertion–deletion loci on 2 map versions, IBM IDP +MMP versions 4 and 5; ), Matthieu Falque [1680 RFLP loci on IBM GNP and GNP LHRF maps; (20)], and the Community Mapping Service (289 loci; described below).
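The hybrid coordinate used for the anchored FPC maps (an integer genetic anchor point, a decimal, then the FPC consensus band coordinate) can be modelled as a pair of integers rather than a single decimal number. The Python sketch below is only an illustration: the class name `HybridCoordinate`, the helper `parse_hybrid`, the exact text format and the example values are assumptions, since the text does not specify how the coordinate is padded or stored.

```python
from typing import NamedTuple


class HybridCoordinate(NamedTuple):
    """Hypothetical model of an anchored-FPC map position."""

    anchor: int  # nearest genetic anchor point, as an integer
    cb: int      # FPC consensus band (CB) coordinate within the contig

    def __str__(self) -> str:
        # Assumed display form: anchor, a decimal point, then the CB value.
        return f"{self.anchor}.{self.cb}"


def parse_hybrid(text: str) -> HybridCoordinate:
    """Split 'anchor.cb' into its two integer parts."""
    anchor, _, cb = text.partition(".")
    return HybridCoordinate(int(anchor), int(cb))


# Made-up positions; sorting compares the anchor first, then the CB value.
positions = [parse_hybrid(s) for s in ("12.40", "12.5", "3.100")]
for pos in sorted(positions):
    print(pos)
```

Keeping the two parts separate is a design choice of this sketch: if the same strings were read as floating-point numbers, 12.40 would sort before 12.5 even though consensus band 40 lies beyond band 5.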

NEW FUNCTIONALITIES

In addition to adding new data, MaizeGDB personnel remain committed to maintaining and improving the interface to the database. New map displays have ‘sequence view’ (which shows the sequences associated with each locus on the map) and ‘primer view’ (which shows the primers and probes for each locus on the map) options, which are accessible toward the top of all default map views (e.g. ). Search algorithms have been refined to allow researchers to simply enter a term in the search field toward the top of any page at MaizeGDB (see Figure 1a) and quickly find relevant results, and summary information is now available on search results pages so that the most relevant records can easily be identified. For example, if a researcher were to search all records with the string ‘r1’, over 170 loci would be found. To help locate an exact match, i.e. the locus r1 colored1, that record is shown at the top of the list of results. In addition, though the appearance of many data displays has not changed, the underlying code has been rewritten to improve load times and to optimize computational efficiency. In an effort to improve access to diverse maize data, the MaizeGDB interface has been modified to include an abundance of linkages to other databases including the Plant Ontology Consortium [(21); ], Gramene [(22); ] and the Maize Sequencing Project's genome browser. Data displays provide abundant context-sensitive linkages to records in other databases, enabling users to visit, e.g. a gene record at MaizeGDB and with just a click quickly find related sequences, annotated maps and similar sequences at other websites. Recently the Community Curation Tools (accessible through the ‘tools’ link at the top of any page at MaizeGDB; Figure 1) were updated to enable the entry of QTL data. 
Insights gained from experience with QTL data entry into the legacy MaizeDB (23) were leveraged in planning the functionality of this module, and linking of trait and map location to germplasm containing the superior allele is enforced. New automatic nomenclature features ensure consistency and minimize data entry effort. General functionality of the QTL module is consistent with that of the other Community Curation Tools described previously (24).
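The search behavior described in this section, in which a substring search returns many loci but the exact match is listed first, can be sketched in a few lines of Python. This is a minimal illustration and not MaizeGDB's search code: the function name `rank_results` and the short list of locus symbols are placeholders chosen for the example.

```python
def rank_results(query, names):
    """Return every name containing the query, with exact matches first.

    Matching is case-insensitive. Python's sort is stable, so names that
    are not exact matches keep their original relative order.
    """
    q = query.casefold()
    hits = [name for name in names if q in name.casefold()]
    # The key is False for an exact match and True otherwise;
    # False sorts before True, so exact matches rise to the top.
    return sorted(hits, key=lambda name: name.casefold() != q)


# Placeholder symbols used purely for illustration.
symbols = ["pr1", "r1", "cr1", "bz1"]
print(rank_results("r1", symbols))
```

A search for ‘r1’ over this list keeps three of the four symbols and moves the exact match to the front, which mirrors how the locus r1 colored1 is surfaced among the many loci whose names contain the same string.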

COMMUNITY SUPPORT ACTIVITIES

MaizeGDB hosts an Editorial Board (Figure 1d) whose members communicate with MaizeGDB personnel monthly to report their selections of current and historic literature germane to maize research. The list of Editorial Board selections is appropriate to guide journal clubs or for use by individuals keen to remain abreast of the advances relevant to maize research. It can be accessed at . MaizeGDB personnel support the Maize Genetics Executive Committee (MGEC; Figure 1e) by providing a venue for them to contact and interact with the community of maize researchers. MaizeGDB personnel create and administer customized community surveys and handle and monitor elections for the MGEC (which involves the creation of methods for anonymous balloting and key-based restrictions to preclude stuffing the ballot box). These services help the MGEC to understand the needs of and communicate clearly with the community of maize researchers. Note also that author M. L. Schaeffer is a member of the MGEC. The Annual Maize Genetics Conference (see Figure 1f) is growing rapidly, with a nearly 20% increase in attendance within the last 3 years and a near doubling in the number of abstracts submitted. Workers at MaizeGDB created a set of tools to accept abstract submissions and to manage review of the abstracts by Conference Steering Committee members. MaizeGDB personnel also maintain the mailing list for the Conference Steering Committee, and authors T. E. Seigfried and M. L. Schaeffer serve on the Conference Steering Committee in an ex officio capacity and also assemble and print the conference program. MaizeGDB hosts the Maize Newsletter (MNL), and MaizeGDB Curator M. L. Schaeffer is a co-editor of that publication. The main MNL site is at MaizeGDB (), and new volumes are staged at the University of Missouri-Columbia (). Contributions from collaborators are posted as received, and, with minor editing, redacted for printing once a year. 
The MNL also includes annual reports from the Maize Genetics Cooperation–Stock Center and MaizeGDB, and any new map syntheses developed by MaizeGDB and collaborators. Funding for the redaction, printing and mailing comes from an endowment established by contributions from collaborators. CIMDE is the community mapping service originally provided by personnel working on the Maize Mapping Project [(25); ]. Author M. L. Schaeffer currently manages CIMDE. Map positions are determined using 580 framework loci and MapMaker software (26), and are returned within 2 weeks of submission of raw map scores. When those data become public, they are incorporated into the Community IBM Map (cIBM) along with related documentation, such as the contributor, nucleotide sequence accessions, encoded proteins and literature citations. The most recent update (cIBM2005; ) includes the published maps of the Genoplante Consortium (20) and data submitted for inclusion in the public IBM Neighbors map (described above). The data sources are attributed, and sequence accessions related to mapped loci as well as sequence details, such as primers required to reproduce the mappings, are actively solicited and annotated.

DATABASE AND AVAILABILITY

Standard operating procedures, accessibility and machine architecture are reviewed in detail elsewhere (24). The following is a brief description of how the project databases are administered and explains availability and methods of access. The MaizeGDB schema is accessible online at . Presently there are three copies of the database and interface, which exist on three identical servers. The interface on each server interacts with data on the local copy of the database, thereby allowing the maintenance of a production environment (i.e. the copy accessed through ), a curation or staging environment, and an isolated testing and development environment. The development environment functions as a playground where data manipulation and interface development are tested. The curation database stores the most current data. As data are entered into the curation database (by researchers using the Community Curation Tools and by professional curators), they are initially listed as non-public and can only be viewed by MaizeGDB staff members. Once the data are reviewed, a curation level tag is changed so that the new records will become publicly accessible. Updates to production are carried out by replacing the existing production copy of the database with a duplicate of the curation database and the latest sequence files from PlantGDB. This update generally occurs on the first Tuesday of each month (see Figure 1g). The curation database is backed up on a daily basis and is available for download () for those who have Oracle RDBMS installed locally. Requests to gain read-only SQL access to the database should be directed by email to mgdb@iastate.edu. Data housed at MaizeGDB are in the public domain, hence they are freely available for use without a license.
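Because production updates generally occur on the first Tuesday of each month, anyone scripting against downloaded copies of the data may want to compute that date. The Python sketch below uses only the standard library; the function name `first_tuesday` is hypothetical, and the actual release date can differ from the nominal one, so the date shown on the front page remains the authority.

```python
from datetime import date, timedelta


def first_tuesday(year: int, month: int) -> date:
    """Return the date of the first Tuesday in the given month."""
    first = date(year, month, 1)
    # date.weekday() numbers Monday as 0, so Tuesday is 1.
    days_until_tuesday = (1 - first.weekday()) % 7
    return first + timedelta(days=days_until_tuesday)


# Nominal update dates for the first three months of 2007.
for month in (1, 2, 3):
    print(first_tuesday(2007, month).isoformat())
```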

FUTURE PLANS

The genome of maize inbred line B73 is being sequenced, and the official site (called the Maize Genome Browser; ) will soon be publicly available. The Maize Genome Browser has embedded links to MaizeGDB throughout, and context-sensitive links from MaizeGDB into the Maize Genome Browser are planned. By creating links to the Maize Genome Browser, MaizeGDB can connect researchers with up-to-date views of the maize genome as it is sequenced without bearing the responsibility of supporting an independent genome browser for maize. To learn more about the Maize Genome Sequencing Consortium's plans and to find updates on their progress, visit . In most cases, a model organism's official gene models are housed at the model organism database (MOD). Because the maize genome is being sequenced and MaizeGDB (the MOD for maize) is expected to be charged with storing and making available the official gene models, plans are under way to build an infrastructure for storing this new data type and to create a version control system that stores each major gene model release. Not all data types are currently integrated with sequence data. Breeders find it difficult to locate all genomic and phenotypic data for plant germplasm collections because breeding and sequence data are housed in separate, disconnected databases. Although MaizeGDB stores maize data related to genetics and genomics, most historical, geographic origin, characterization and evaluation data associated with the National Plant Germplasm System's Plant Genetic Resources collections are housed in the Germplasm Resource Information Network (GRIN; ). Work to integrate MaizeGDB with GRIN is a high-priority item for development in the coming year so that breeders can more easily associate genetic and genomic data with traditional crop improvement information.

22 in total

1. Male mouse recombination maps for each autosome identified by chromosome painting.

Authors: Lutz Froenicke; Lorinda K Anderson; Johannes Wienberg; Terry Ashley
Journal: Am J Hum Genet Date: 2002-11-12 Impact factor: 11.025

2. Intraspecific violation of genetic colinearity and its implications in maize.

Authors: Huihua Fu; Hugo K Dooner
Journal: Proc Natl Acad Sci U S A Date: 2002-06-11 Impact factor: 11.205

3. Maintaining collections of mutants for plant functional genomics.

Authors: Randy Scholl; Martin M Sachs; Doreen Ware
Journal: Methods Mol Biol Date: 2003

4. Comparative plant genomics resources at PlantGDB.

Authors: Qunfeng Dong; Carolyn J Lawrence; Shannon D Schlueter; Matthew D Wilkerson; Stefan Kurtz; Carol Lushbough; Volker Brendel
Journal: Plant Physiol Date: 2005-10 Impact factor: 8.340

5. The maize genetics and genomics database. The community resource for access to diverse maize data.

Authors: Carolyn J Lawrence; Trent E Seigfried; Volker Brendel
Journal: Plant Physiol Date: 2005-05 Impact factor: 8.340

6. FPC Web tools for rice, maize, and distribution.

Authors: Vishal Pampanwar; Friedrich Engler; James Hatfield; Steve Blundy; Gaurav Gupta; Carol Soderlund
Journal: Plant Physiol Date: 2005-05 Impact factor: 8.340

7. Two-dimensional spreads of synaptonemal complexes from solanaceous plants. VI. High-resolution recombination nodule map for tomato (Lycopersicon esculentum).

Authors: J D Sherman; S M Stack
Journal: Genetics Date: 1995-10 Impact factor: 4.562

8. High-resolution crossover maps for each bivalent of Zea mays using recombination nodules.

Authors: Lorinda K Anderson; Gregory G Doyle; Brian Brigham; Jenna Carter; Kristina D Hooker; Ann Lai; Mindy Rice; Stephen M Stack
Journal: Genetics Date: 2003-10 Impact factor: 4.562

9. Anchoring 9,371 maize expressed sequence tagged unigenes to the bacterial artificial chromosome contig map by two-dimensional overgo hybridization.

Authors: Jack Gardiner; Steven Schroeder; Mary L Polacco; Hector Sanchez-Villeda; Zhiwei Fang; Michele Morgante; Tim Landewe; Kevin Fengler; Francisco Useche; Michael Hanafey; Scott Tingey; Hugh Chou; Rod Wing; Carol Soderlund; Edward H Coe
Journal: Plant Physiol Date: 2004-03-12 Impact factor: 8.340

10. MAPMAKER: an interactive computer package for constructing primary genetic linkage maps of experimental and natural populations.

Authors: E S Lander; P Green; J Abrahamson; A Barlow; M J Daly; S E Lincoln; L A Newberg; L Newburg
Journal: Genomics Date: 1987-10 Impact factor: 5.736
