John L Portwood, Margaret R Woodhouse, Ethalinda K Cannon, Jack M Gardiner, Lisa C Harper, Mary L Schaeffer, Jesse R Walsh, Taner Z Sen, Kyoung Tak Cho, David A Schott, Bremen L Braun, Miranda Dietze, Brittney Dunfee, Christine G Elsik, Nancy Manchanda, Ed Coe, Marty Sachs, Philip Stinard, Josh Tolbert, Shane Zimmerman, Carson M Andorf.
Abstract
Since its 2015 update, MaizeGDB, the Maize Genetics and Genomics database, has expanded to support the sequenced genomes of many maize inbred lines in addition to the B73 reference genome assembly. Curation and development efforts have targeted high-quality datasets and tools to support maize trait analysis, germplasm analysis, genetic studies, and breeding. MaizeGDB hosts a wide range of data and has recently added support for new data types, including genome metadata, RNA-seq, proteomics, synteny, and large-scale diversity. To improve access to and visualization of these data types, several new tools have been implemented to: access large-scale maize diversity data (SNPversity), download and compare gene expression data (qTeller), visualize pedigree data (Pedigree Viewer), link genes with phenotype images (MaizeDIG), and enable flexible user-specified queries to the MaizeGDB database (MaizeMine). MaizeGDB also continues to be the community hub for maize research, coordinating activities and providing technical support to the maize research community. Here we report the changes MaizeGDB has made within the last three years to keep pace with recent software and research advances, as well as with the pan-genomic landscape that cheaper and better sequencing technologies have made possible. MaizeGDB is accessible online at https://www.maizegdb.org.
Year: 2019 PMID: 30407532 PMCID: PMC6323944 DOI: 10.1093/nar/gky1046
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 2. This chart tracks the number of genome assemblies in MaizeGDB’s database since 2003. The first genome integrated into MaizeGDB was the B73 BAC-based assembly in November 2008, and until late 2016 the only additional genomes were improved versions of B73. Following the release of B73 RefGen_v4, several maize lines have been sequenced and assembled de novo and incorporated into MaizeGDB. We expect this number to increase substantially as the cost of sequencing continues to fall, especially with the upcoming release of the 25 NAM founder lines. By 2020 we expect to be hosting ∼40 genomes.
Figure 3. This image is a composite of the five new tools at MaizeGDB. Section (A) shows a MaizeMine results page for a particular gene model, which includes information on transcripts/proteins, expression, function, pathways, homology, and publications. Section (B) shows two images from qTeller. The top image shows the expression values of a particular gene model across all of the tissues in the database. The bottom image shows a scatterplot of FPKM values for two gene models. Section (C) shows the results of a SNPversity query, which indicate the major/minor allele for a given inbred line at the top of the table and whether each SNP is located within a gene model's exon, within its intron, or in an intergenic region. Section (D) shows the MaizeDIG interface, which allows curators to tag phenotypes in the image and link them to particular genes. The box outline in the middle of the image shows an example of a tagged phenotype. Section (E) shows the results of a Pedigree Viewer query for all stocks that were developed in Illinois.
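The FPKM values plotted in panel (B) follow the standard definition: fragments per kilobase of transcript per million mapped fragments. The sketch below shows how such values could be computed and paired for a two-gene scatterplot. It is not qTeller's code; the gene names, lengths, fragment counts, and library sizes are hypothetical values chosen for illustration.

```python
def fpkm(fragment_count: int, gene_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical library sizes (total mapped fragments) per tissue.
library_sizes = {"root": 20_000_000, "leaf": 25_000_000, "tassel": 10_000_000}

# Hypothetical gene models with transcript lengths and raw fragment counts.
genes = {
    "gene_a": {"length": 2_000, "counts": {"root": 400, "leaf": 1_000, "tassel": 100}},
    "gene_b": {"length": 1_000, "counts": {"root": 100, "leaf": 500, "tassel": 50}},
}

# Expression of each gene in each tissue.
expression = {
    name: {
        tissue: fpkm(info["counts"][tissue], info["length"], library_sizes[tissue])
        for tissue in library_sizes
    }
    for name, info in genes.items()
}

# One (x, y) point per tissue, as in a two-gene scatterplot.
scatter_points = [
    (tissue, expression["gene_a"][tissue], expression["gene_b"][tissue])
    for tissue in library_sizes
]

for tissue, x, y in scatter_points:
    print(f"{tissue}: gene_a={x:.1f} FPKM, gene_b={y:.1f} FPKM")
```

Normalizing by both transcript length and library size is what makes values comparable across genes and across tissues, which a scatterplot of two gene models requires.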
This table shows the data centers in MaizeGDB and their content. Each data center groups similar data types to simplify custom querying and help users find what they need faster.
| Data Center | Description | Record Count |
|---|---|---|
| Alleles/polymorphisms | Queryable data on alleles of known genes. | 810,229 alleles |
| BACs | Queryable BACs from the Maize Mapping Project isolated from B73 | 439,464 BACs |
| Cytogenetics | Cytogenetic data including: maps, knobs, centromeres, telomeres, karyotypes, stocks, and other resources | N/A |
| Diversity | Various tools and resources for downloading and querying diversity, SNP, and trait data | 419,061 traits |
| Expression | Lists of atlas type expression data | N/A |
| Genes/Gene Models | Queryable genes and gene models from all genomes hosted at MaizeGDB. Additional information about the reference annotation for B73 can also be found here. | 13,468 genes / 598,794 gene models |
| Gene Products | Queryable gene products. | 1,992 gene products |
| Images | Queryable images of traits, phenotypes, pests, gel patterns, and mutants | 10,115 images |
| Loci/QTLs | Queryable loci of all locus types. | 214,464 loci |
| Maps | Queryable genetic maps. | 2,117 maps |
| Metabolic Pathways | Metabolic pathway data and CornCyc 9.0 | 485 pathways (B73v3) / 505 pathways (B73v4) |
| Molecular Markers | Queryable markers of all probe types (BACs, ESTs, SSRs, etc.). | 771,136 markers |
| Phenotypes | Queryable phenotype and mutant data. | 1,120 phenotypes |
| References | Queryable references of all publication types (articles, reports, abstracts, books, etc). | 47,584 references |
| Stocks | Queryable stocks with links to order them from their respective repositories where applicable. | 66,825 stocks |
Figure 1. This chart tracks the number of records in MaizeGDB’s database for the listed data types since 2003. The Loci, References, Stock, and Variation data types had comparable numbers of records until July 2010, when hundreds of thousands of Variation records from the first maize HapMap (43) were added to the database. Gene model records were not introduced until the release of the first maize reference assembly, B73 RefGen_v1, in late 2009, and their number began to rise sharply in late 2016 with the release of each new maize genome assembly (see Figure 2). The number of Phenotype records has not changed significantly, rising from 1,016 in 2003 to 1,136 in August 2018.
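To put the Phenotype figures from this caption in perspective, the short calculation below works out the absolute and relative change between the two reported counts. Only the two counts come from the caption; the helper name `percent_change` is introduced here for illustration.

```python
def percent_change(old: int, new: int) -> float:
    """Relative change from old to new, as a percentage of old."""
    return (new - old) / old * 100


phenotypes_2003 = 1_016  # count reported for 2003
phenotypes_2018 = 1_136  # count reported for August 2018

added = phenotypes_2018 - phenotypes_2003
growth = percent_change(phenotypes_2003, phenotypes_2018)

print(f"{added} records added, {growth:.1f}% growth")
# prints "120 records added, 11.8% growth"
```

Growth of roughly 12% over about fifteen years is small next to the Variation and Gene model data types, which the caption describes as growing by hundreds of thousands of records or rising sharply.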
This table depicts all of the datasets loaded into MaizeMine as of this publication.
| Data Type | Description | Species | PubMed | Link |
|---|---|---|---|---|
| | Maize community gene set for B73v4 | | | |
| | Maize community gene set for B73v3 | | | |
| | NCBI annotation (RefSeq and Gene) | | | |
| | Orthologue and paralogue relationships | | | |
| | Protein annotations from UniProt | | | |
| | Protein family and domain assignments to proteins from InterPro | | | |
| | GO annotations | | | |
| | Pathway information from Plant Reactome | | | |
| | Pathway information from KEGG | | | |
| | Pathway information from CornCyc 8.0 | | | |
| | A mapping from genes to publications | | | |
| | Gene expression computed on reference gene set RefSeq, AGPv3, and AGPv4 | | | |
| | Chromosome assembly for B73v4 | | | |
| | Chromosome assembly for B73v3 | | | |