Gramene 2018: unifying comparative genomics and pathway resources for plant research.

Marcela K Tello-Ruiz¹, Sushma Naithani², Joshua C Stein¹, Parul Gupta², Michael Campbell¹, Andrew Olson¹, Sharon Wei¹, Justin Preece², Matthew J Geniza², Yinping Jiao¹, Young Koung Lee^1,3, Bo Wang¹, Joseph Mulvaney¹, Kapeel Chougule¹, Justin Elser², Noor Al-Bader², Sunita Kumari¹, James Thomason¹, Vivek Kumar¹, Daniel M Bolser⁴, Guy Naamati⁴, Electra Tapanari⁴, Nuno Fonseca⁴, Laura Huerta⁴, Haider Iqbal⁴, Maria Keays⁴, Alfonso Munoz-Pomer Fuentes⁴, Amy Tang⁴, Antonio Fabregat⁴, Peter D'Eustachio⁵, Joel Weiser⁶, Lincoln D Stein⁷, Robert Petryszak⁴, Irene Papatheodorou⁴, Paul J Kersey⁴, Patti Lockhart⁸, Crispin Taylor⁸, Pankaj Jaiswal², Doreen Ware^1,9.

Abstract

Gramene (http://www.gramene.org) is a knowledgebase for comparative functional analysis in major crops and model plant species. The current release, #54, includes over 1.7 million genes from 44 reference genomes, most of which were organized into 62,367 gene families through orthologous and paralogous gene classification, whole-genome alignments, and synteny. Additional gene annotations include ontology-based protein structure and function; genetic, epigenetic, and phenotypic diversity; and pathway associations. Gramene's Plant Reactome provides a knowledgebase of cellular-level plant pathway networks. Specifically, it uses curated rice reference pathways to derive pathway projections for an additional 66 species based on gene orthology, and facilitates display of gene expression, gene-gene interactions, and user-defined omics data in the context of these pathways. As a community portal, Gramene integrates best-of-class software and infrastructure components including the Ensembl genome browser, Reactome pathway browser, and Expression Atlas widgets, and undergoes periodic data and software upgrades. Via powerful, intuitive search interfaces, users can easily query across various portals and interactively analyze search results by clicking on diverse features such as genomic context, highly augmented gene trees, gene expression anatomograms, associated pathways, and external informatics resources. All data in Gramene are accessible through both visual and programmatic interfaces. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.

Year: 2018 PMID: 29165610 PMCID: PMC5753211 DOI: 10.1093/nar/gkx1111

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

The Gramene database is a versatile resource for querying, visualizing, analyzing, and comparing plant genome and pathway data across crops and model species. Our genomic data portal was produced in collaboration with Ensembl Genomes (1,2), and shares infrastructure, specialized software components and pre-computed data with Ensembl Plants. For example, the portal uses Ensembl's data model and analysis workflows to generate baseline genome annotations and perform genome-wide comparative analysis in order to construct phylogenetic trees, synteny, and whole-genome alignments for 44 plant reference genomes; the Ensembl genome browser to visualize and explore genomic data; and Ensembl online tools such as BLAST, BioMart, assembly converter, and the Variant Effect Predictor (VEP) for data analysis. The Plant Reactome database (http://plantreactome.gramene.org) (3,4), Gramene's pathways portal, is an ongoing development effort in collaboration with the human Reactome project (5). The Plant Reactome hosts pathway data for 67 species representing several model and crop plants, algae, and unicellular photoautotrophs. Rice is used as the reference species for manual curation of pathways and to derive gene orthology–based projections for other species, as described previously (3,4,6). Curated rice pathway data are integrated in the central pathway curation database and maintained by the human Reactome project. In addition, Gramene's gene pages and the Plant Reactome pathway browser make use of the whole-plant and seed anatomograms produced by the EMBL-EBI’s Expression Atlas to display baseline gene expression. All data in the Gramene database can be downloaded in graphical and/or tabular form in various standard formats and are also accessible programmatically via Application Programming Interfaces (APIs). 
Gramene also continues to provide access to archived legacy data (http://www.gramene.org/archive) (6) including Pathway Tools–based metabolic networks (7–9) from CyVerse (10). In this article, we describe the new data and functionalities in the Gramene database introduced since our last NAR report (6). A summary is provided in Table 1.

Table 1.

New data and functionalities in the Gramene database, by release (September 2015 - August 2017)

NEW GRAMENE HOMEPAGE AND INTEGRATED SEARCH INTERFACE

A brand new website homepage with an integrated search interface, built on the React JavaScript framework to support development of client-side visualizations, was released in May 2016 (build 53). The new homepage has a type-ahead search box to facilitate quick search and links to our various portals and resources. The genomic distribution of genes in the search results is summarized, and individual genes are displayed in an expandable list with embedded views of gene structure, gene expression, conservation, associated pathways, and database cross-references, described later. When a query only matches genes in well-annotated model organisms, the homology tab can be used to modify the search to show evolutionarily related genes. The main feature of the homology tab is an integrated visualization component that displays the expanded gene tree showing the gene of interest and its closest well-annotated homolog. Additional information is displayed as a track next to each leaf node or collapsed subtree. The default display mode shows regions of proteins included in the multiple sequence alignment, with color-coded InterPro domains. Users can pan and zoom the view or switch to a scrollable multiple-sequence alignment mode. Because navigating search results or a gene tree with hundreds of family members from dozens of species can be overwhelming, and is unnecessary for most users, we implemented a filter to limit search results by a user-defined subset of species. Gene tree pruning removes excess gaps from the multiple sequence alignment, thereby eliminating visual clutter from less closely related species. These developments add value beyond the standard views by integrating diverse data in fast interfaces, while giving users the ability to focus on the species most relevant to their research. All code developed for the web service and the search interface is maintained on GitHub (https://github.com/warelab).
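The effect of species filtering on the alignment view can be illustrated with a minimal Python sketch. It is not Gramene's implementation: the function name prune_alignment, the data layout, and the toy sequences are hypothetical, and the sketch assumes that pruning amounts to dropping rows from unselected species and then removing columns that contain only gaps.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical layout: gene id -> (species name, aligned sequence with '-' gaps).
Alignment = Dict[str, Tuple[str, str]]


def prune_alignment(alignment: Alignment, keep_species: Iterable[str]) -> Alignment:
    """Keep rows from the selected species, then drop gap-only columns."""
    keep = set(keep_species)
    rows = {gene: (sp, seq) for gene, (sp, seq) in alignment.items() if sp in keep}
    if not rows:
        return {}
    length = len(next(iter(rows.values()))[1])
    # A column survives if at least one remaining row has a residue there.
    kept_cols = [
        i for i in range(length)
        if any(seq[i] != "-" for _, seq in rows.values())
    ]
    return {
        gene: (sp, "".join(seq[i] for i in kept_cols))
        for gene, (sp, seq) in rows.items()
    }


# Toy data (assumed for illustration only).
toy = {
    "geneA": ("oryza_sativa", "MK--LV"),
    "geneB": ("zea_mays", "MK--IV"),
    "geneC": ("arabidopsis_thaliana", "MKPALV"),
}
pruned = prune_alignment(toy, ["oryza_sativa", "zea_mays"])
```

In this toy case the two columns that were gaps in both grass sequences exist only because of the Arabidopsis row, so they disappear once that row is filtered out.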

NEW AND UPDATED PLANT GENOMES

The current release of Gramene added five new fully sequenced reference plant genomes and updated four existing genomes (Figure 1A, Table 1, and Supplementary Table S1), bringing the total to 44. The new species include three dicots, Beta vulgaris subsp. vulgaris (sugar beet), Brassica napus (oilseed rape) and Trifolium pratense (red clover), and two red algae, Chondrus crispus (Irish or carrageen moss) and Galdieria sulphuraria, an extremophilic unicellular species. All reference genome assemblies in Gramene are accessioned in the International Nucleotide Sequence Database Collaboration, INSDC (11), a policy that ensures provenance and interoperability with other online resources. The complete list of genomes in Gramene, shown in Figure 1A, includes 15 eudicots, 21 monocots, 1 basal angiosperm, and 7 non-flowering species. The four updated assemblies correspond to three staple food crops, Zea mays cv. B73 (maize), Triticum aestivum cv. Chinese Spring (wheat), and Sorghum bicolor cv. BTx623 (sorghum), and a wild rice indigenous to sub-Saharan Africa, Oryza longistaminata. Gramene researchers led the innovative effort to generate the new maize reference assembly (B73 RefGen_v4), constructed entirely from third-generation long-read sequence technology with the aid of an optical map (12).

Figure 1.

Visualization of query results in the new Gramene database search interface. (A) Species tree of 44 plant reference genomes available in Gramene build 54. Blue, red, black and green lines correspond respectively to 21 monocots, 15 dicots, the basal angiosperm Amborella and 7 lower plant species. New genome assemblies are marked with a red asterisk (*), and updated assemblies are highlighted in violet font (see also Table 1 and Supplementary Table S1). (B) Customized gene tree alignment views, color coded by InterPro domain. Filtered query results and gene tree branches are shown for selected species. (C) Detailed DNA sequence alignment. (D) Oryza sativa QTLs, SSRs, and RFLPs from legacy sets reincorporated to the IRGSP1 assembly (mapping courtesy of KeyGene). QTLs from Q-TARO (Yonemaru et al., 2010) are also available.

The new maize assembly was released with a new set of gene annotations generated by the MAKER-P pipeline (13) from 111 000 long-read Iso-Seq transcripts obtained by single-molecule sequencing (14). This approach more than doubled the number of alternative transcripts from 1.5 to 3.8 per gene, resolved gaps and assembly errors, corrected strand, consolidated gene models, and anchored previously unanchored genes (12). The new bread wheat assembly, TGACv1, produced by the Earlham Institute (formerly TGAC), includes ∼99% of the genes remapped from the prior IWGSCv1 Chromosome Survey Sequence assembly (15). To facilitate transformation of genomic coordinates between assemblies, we continue to maintain an easy-to-use online assembly converter tool. For example, using this utility, it is now possible to traverse across the three maize B73 reference assemblies, RefGen_v2, RefGen_v3, and RefGen_v4 (http://ensembl.gramene.org/Zea_mays/Tools/AssemblyConverter?db=core).

UPDATED ANNOTATION AND COMPARATIVE GENOMICS

Genes and their encoded proteins are functionally annotated in Gramene using InterProScan (16), allowing comparison of InterPro functional domains and Gene Ontology terms within a gene-centered phylogenomic framework. Release 54 includes revised gene models for the updated maize, wheat and sorghum genome sequence assemblies, as well as new gene models for Arabidopsis thaliana (Araport 11; https://www.araport.org/data/araport11), although the genome assembly for the latter was not updated. Our phylogenomic analysis utilizes the Compara method (17) to reconstruct the evolutionary history of genes by clustering homologous genes into families, building gene trees with stable IDs for future reference, and classifying pairwise orthologous and paralogous relationships. Each node of a consensus tree is assigned a taxonomic date to identify points at which gene duplications occurred within evolving lineages, as well as specify the common ancestor in which the gene family emerged. For closely related species, synteny maps are constructed to reveal regions of conserved gene order (18,19). These maps are complemented by whole-genome alignments showing conserved intergenic, as well as genic, regions (19). These data can be visualized using the web-based Ensembl browser (20,21) and mined using BioMart (22,23), both of which are accessible programmatically. These data have been used for a wide range of applications, including the study of protein–protein interaction network evolution in Arabidopsis (24), development of methods for enrichment of candidate genes in genome-wide association studies in maize (25), and construction of genome-scale knowledge networks in wheat and barley (26). In addition, Compara analysis allows us to identify gene annotation errors (split gene models), suggest functional annotations for less well-annotated orthologs, and project mappings from curated rice pathways to other species.
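The idea behind classifying pairwise relationships from a reconciled gene tree can be sketched as follows. This is a simplified illustration of the textbook rule (two genes are orthologs if their last common ancestor node is a speciation, and paralogs if it is a duplication), not the Compara pipeline itself, which is considerably more involved. The tree layout, the key names ("event", "children", "gene"), and the gene identifiers are assumptions made for the example.

```python
from itertools import combinations, product
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]


def leaves(node: dict) -> List[str]:
    """Return the gene ids of all leaves below a node."""
    if "gene" in node:
        return [node["gene"]]
    found: List[str] = []
    for child in node["children"]:
        found.extend(leaves(child))
    return found


def classify_pairs(node: dict, out: Optional[Dict[Pair, str]] = None) -> Dict[Pair, str]:
    """Label each gene pair by the event type of its last common ancestor."""
    if out is None:
        out = {}
    if "gene" in node:
        return out
    relation = "ortholog" if node["event"] == "speciation" else "paralog"
    child_leaves = [leaves(child) for child in node["children"]]
    # Genes in different child subtrees have this node as their last common ancestor.
    for left, right in combinations(child_leaves, 2):
        for a, b in product(left, right):
            out[tuple(sorted((a, b)))] = relation
    for child in node["children"]:
        classify_pairs(child, out)
    return out


# Hypothetical toy tree: a rice-specific duplication after the rice/maize split.
toy_tree = {
    "event": "speciation",
    "children": [
        {"event": "duplication",
         "children": [{"gene": "rice1"}, {"gene": "rice2"}]},
        {"gene": "maize1"},
    ],
}
relations = classify_pairs(toy_tree)
```

Here both rice copies are orthologs of the single maize gene, while the two rice copies are paralogs of each other.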

NEW PLANT GENETIC DIVERSITY AND MUTAGENIZED SEQUENCE VARIATION

Gramene continues to provide single-nucleotide polymorphism (SNP) and/or structural variation data for 12 genomes (6,27–42; http://1001genomes.org). We recently added new variants for A. thaliana, Japonica rice, O. glumaepatula, sorghum and wheat. Since our last NAR update, we added a combined total of over 8.8 million ethyl methanesulfonate (EMS)-derived mutations for sorghum and wheat. The sorghum dataset (41) includes ∼1.5 million EMS-induced G/C to A/T transition mutations, annotated from 252 M3 families selected from a 6400-mutant library in the BTx623 background. The wheat dataset comprises ∼7.4 million EMS-type variants derived from sequencing of tetraploid (2.8 million cv. ‘Kronos’) and hexaploid (4.6 million cv. ‘Cadenza’) TILLING populations. Mutations were originally called in the IWGSC CSS scaffolds, as described in Krasileva et al. (42), and the high-confidence (HetMC5HomMC3) mutations were then projected onto the TGAC Chinese Spring scaffolds. The wheat EMS mutant data are also available in a specialized database generated by a joint project between the University of California Davis, Rothamsted Research, The Earlham Institute, and the John Innes Centre. Researchers and breeders can search this online database, identify mutations in different copies of their target genes, and request seeds for use in studies of gene function or improvement of wheat varieties via the project online search tools at http://www.wheat-tilling.com and http://dubcovskylab.ucdavis.edu/wheat-tilling. Seed requests can also be made from the UK SeedStor resource (https://www.seedstor.ac.uk/shopping-cart-tilling.php). The number of O. glumaepatula variants from the Oryza Genome Evolution project (Stein et al., Submitted) is currently 4.9 million. For assembly updates of maize, wheat, and sorghum, variants were remapped to the latest genomic coordinate system (e.g. HapMap SNPs for maize, wheat and inter-homoeologous wheat variants). 
We have also incorporated rice QTLs from Gramene's archives (6) (http://archive.gramene.org/qtl) and the Q-TARO database (43; http://qtaro.abr.affrc.go.jp), as well as legacy rice SSR/RFLP data from our archives (http://archive.gramene.org/markers) remapped to the IRGSPv1 assembly. We continue to analyze and assign putative functional and structural consequences to gene variants using the Ensembl VEP tool (4,44). Visualization of these consequences is provided in the context of transcript structure and protein domains. For many studies, we also provide information on the genotypes of individual plant accessions and their phenotypes.
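As a small illustration of what "EMS-type" means for the sorghum and wheat mutant sets described above, the sketch below flags single-nucleotide changes that are G/C to A/T transitions (G to A, or C to T when read on the opposite strand). The function name and the toy variant list are hypothetical; real pipelines apply additional quality and coverage filters that are not shown here.

```python
from typing import List, Tuple

# EMS predominantly induces G>A transitions, observed as C>T on the other strand.
EMS_TRANSITIONS = {("G", "A"), ("C", "T")}


def is_ems_type(ref: str, alt: str) -> bool:
    """Return True if a single-nucleotide change is a G/C to A/T transition."""
    return (ref.upper(), alt.upper()) in EMS_TRANSITIONS


# Toy (reference allele, alternate allele) pairs, assumed for illustration.
variants: List[Tuple[str, str]] = [("G", "A"), ("C", "T"), ("A", "G"), ("G", "T")]
flags = [is_ems_type(ref, alt) for ref, alt in variants]
```

Only the first two toy variants are flagged: A to G is a transition in the wrong direction, and G to T is a transversion.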

PLANT GENE EXPRESSION ATLAS

The plant gene Expression Atlas (https://www.ebi.ac.uk/gxa/plant/experiments) contains transcriptomic data from 731 experiments in 18 plant species. These data have been manually curated, quality-controlled, and analyzed using standardized analysis pipelines (45). Baseline expression data from RNA-seq experiments from 14 plant species show expression levels of gene products under ‘normal’ conditions in various tissues (leaves, roots, etc.), developmental stages, cultivars, and ecotypes. The baseline expression profile of an individual gene across all tissue samples and growth stages, from EMBL-EBI Expression Atlas, can be accessed from the gene page in the Gramene database, as well as from the Plant Reactome Pathway Browser. Differential gene expression data, including responses to environmental stresses, genetic mutations, and bacterial infection, are available for over 2000 manually curated pairwise comparisons from 15 plant species, and include data from both microarray and RNA-seq experiments. At present, differential expression data can be viewed on the Expression Atlas website, and displayed on demand (not automatically) on the Gramene gene page and the Genome Browser as a genome feature track. Additional features include visualization of GO, pathway, and InterPro domain enrichments in the given expression data. The baseline experiment page allows users to find genes with similar expression profiles. Moreover, to achieve adequate annotation of all experiments in Expression Atlas, the EMBL-EBI Experimental Factor Ontology (46) (EFO) has been supplemented with Plant Ontology (47,48) (PO) and Brenda Tissue and Enzyme Source Ontology (BTO). An automatic, scalable framework is used to propagate manually curated ontology annotations to matching sample attributes, ensuring that all new plant experiments loaded into Expression Atlas in the future will benefit from the existing ontology annotations without further manual curation. 
On a daily basis, an automatic analysis pipeline discovers new RNA-seq runs in 38 plant species in the European Nucleotide Archive, and then performs quality control, aligns them to the genome reference in Ensembl Plants, and quantifies gene and exon expression. The pipeline also re-aligns all runs in a given species when a new genome assembly is released. To date, 33 000 runs have been processed, and the results are available via the RNASeq-er API (http://www.ebi.ac.uk/fg/rnaseq/api), as well as a BioPython module and a CPAN Perl library. All experimental data in Expression Atlas are available for download as R objects from the corresponding experiment pages. Experiments can also be found via an ontology-powered search and retrieved from within the ExpressionAtlas Bioconductor package (https://www.bioconductor.org/packages/release/bioc/html/ExpressionAtlas.html). Finally, high-quality anatomical illustrations in Scalable Vector Graphics (SVG) format for the plant species in Expression Atlas have been created by a professional medical illustrator and deployed to highlight tissues with expression within the baseline expression views.

ENHANCED PLANT REACTOME PORTAL

The Plant Reactome database (3,4) (http://plantreactome.gramene.org) is a unifying resource for describing pathways associated with plant metabolism, development, and differentiation, and their regulation in response to various environmental stimuli. The Plant Reactome uses Japonica rice (O. sativa) as a reference for manual curation of pathways within the framework of plant cell architecture. In the current release, the database contains 241 reference pathways, including 229 curated and 12 predicted rice pathways; the latter were derived from projections of curated human pathways associated with evolutionarily conserved processes including DNA replication, vesicle transport, and translation. Since our last report, we added ∼25 new manually curated rice pathways, including choline biosynthesis, cardiolipin biosynthesis, glycine betaine biosynthesis, suberin biosynthesis, tryptophan biosynthesis, tyrosine biosynthesis, lysine degradation II, inter- and intra-cellular auxin transport, response to cold temperature, circadian rhythm, inflorescence development, and transition from vegetative to reproductive shoot apical meristem. The reference set of rice pathways is used to generate gene orthology–based pathway projections for additional species (4). Since our last NAR report, the number of species available in the Plant Reactome doubled from 33 to 66, corresponding to all 44 species with sequenced reference genomes in Gramene and 22 additional species with publicly available sequenced genomes and/or transcriptomes (http://plantreactome.gramene.org/stats.html). The presence or absence of orthologs drives the projection of homologous reactions and pathway maps for a given species. Users can access pathway data in graphical and tabular format for any member species and compare the projected pathways with reference rice pathways. 
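The orthology-based projection described above can be illustrated with a minimal Python sketch. It is not the Plant Reactome projection pipeline: the function name, the identifiers, and in particular the rule that a reaction is projected when at least one of its rice genes has an ortholog in the target species are assumptions made for the example.

```python
from typing import Dict, Set


def project_pathway(
    reference_reactions: Dict[str, Set[str]],
    orthologs: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Project rice reactions onto a target species via ortholog lookups.

    reference_reactions: reaction id -> rice gene ids taking part in it.
    orthologs: rice gene id -> orthologous gene ids in the target species.
    A reaction with no orthologs in the target species is left out.
    """
    projected: Dict[str, Set[str]] = {}
    for reaction, rice_genes in reference_reactions.items():
        target_genes: Set[str] = set()
        for gene in rice_genes:
            target_genes |= orthologs.get(gene, set())
        if target_genes:  # presence of orthologs drives the projection
            projected[reaction] = target_genes
    return projected


# Hypothetical identifiers, for illustration only.
rice_pathway = {
    "reaction_1": {"rice_gene_a", "rice_gene_b"},
    "reaction_2": {"rice_gene_c"},
}
rice_to_target = {
    "rice_gene_a": {"target_gene_1"},
    "rice_gene_b": {"target_gene_2", "target_gene_3"},
}
projection = project_pathway(rice_pathway, rice_to_target)
```

In this toy case reaction_2 is absent from the projected pathway because its only rice gene has no ortholog in the target species.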
As shown in Figure 2, our new ‘Fireworks’ pathway visualization platform displays hierarchical graphs of pathways, allows exploration of pathways by functional categories, links the Pathway Browser to various public external resources and genome databases that provide complementary information about various pathway entities, displays expression profiles of genes (fetched remotely from EMBL-EBI’s gene Expression Atlas), and provides the option to display gene–gene interaction data from EBI’s IntAct database via PSICQUIC web services. We also support upload, visualization, and analysis of user-defined transcriptome, proteome, metabolome and gene–gene interaction data, which are accessible for download in various standard formats.

Figure 2.

Plant Reactome pathway views and functionalities. (A) The ‘Fireworks’ pathway visualization platform displays a navigation panel (left) and a hierarchical graph of pathways in a pathway viewer window with various associated features, such as summation, download options, and links to external public resources providing complementary information on various pathway entities. (B) A view of the Plant Reactome pathway displaying an overlay of gene–gene interaction data. Two common interactors of the MADS15 and MADS14 transcription factors (MAD1 ORYS and MAD6 ORYS, shown in red boxes) were imported via web services from the IntAct database. (C) Analysis and visualization of user-uploaded gene expression data on the pathway browser, with options to explore full expression profiles of the homologs and download results.

INTEGRATED SEARCH

For each release of Gramene, we run pipelines to extract data and annotation terms from Ensembl, Reactome, Expression Atlas, and other external reference resources, transform them into JSON documents, and load them into MongoDB collections. The documents in the genes collection, initially generated from Ensembl core MySQL databases, were extended to include homology information from Ensembl Compara, relevant reactions and pathways from Plant Reactome, and associated ontology terms. Ontologies, the InterPro domain hierarchy, the pathway hierarchy, and the taxonomy tree have inherent structures in which parent terms are more general than their children. When a gene is associated with a specific term, we also include any ancestor terms from the corresponding ontology in our index, enabling users to find genes associated with any less specific, yet related, term. We also integrated InterPro domain annotations into gene tree documents to support their display in our gene tree browser. To support free text search and complex combinations of filters, we transformed the collection of gene documents for use in Apache Solr. To power the type-ahead search feature, we also prepared an index of suggestions derived from the various MongoDB collections. Each suggestion document includes fields that define a filter on the genes index. This design guides users to choose terms drawn from controlled vocabularies and encourages them to iteratively refine the search by adding or modifying filters. We use Swagger (http://swagger.io) to define, document and serve our APIs, which access data in the MongoDB collections and Solr cores via http://data.gramene.org. We deploy customized installations of Ensembl and Reactome REST APIs to support the build pipeline and specific visualizations in the search interface.
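The ancestor-term expansion used when indexing genes can be sketched in a few lines of Python. This is an illustrative sketch rather than the Gramene build code: the function names, the parent map, and the term identifiers are hypothetical, and a real ontology would be loaded from its source files rather than written inline.

```python
from typing import Dict, Iterable, List, Set


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Collect every ancestor of a term in a hierarchy given as child -> parents."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def index_terms(direct_terms: Iterable[str], parents: Dict[str, List[str]]) -> List[str]:
    """Return the direct terms plus all of their ancestors, ready for indexing."""
    expanded = set(direct_terms)
    for term in list(expanded):
        expanded |= ancestors(term, parents)
    return sorted(expanded)


# Hypothetical three-level hierarchy (child -> list of parents).
toy_parents = {
    "TERM:biosynthesis": ["TERM:metabolism"],
    "TERM:metabolism": ["TERM:root"],
}
gene_index_terms = index_terms(["TERM:biosynthesis"], toy_parents)
```

Because the more general terms are stored with the gene document, a search filtered on TERM:metabolism would also return a gene annotated only with TERM:biosynthesis, without any hierarchy traversal at query time.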

COMMUNITY OUTREACH AND TRAINING

Gramene organizes community outreach activities aimed at training plant biologists at various stages of their careers, including high school students and faculty, undergraduate and graduate students, postdoctoral fellows, technical staff, database managers, senior researchers, and group leaders. We regularly publicize our database updates, meeting reports, new tools, and online or on-site training activities via the Gramene News blog, Facebook and Twitter (see http://gramene.org/outreach). We organize regular online webinars and provide recorded video tutorials on the Gramene YouTube channel (https://goo.gl/ln9RLD) to train researchers to use our resources and bioinformatics tools, as well as to access data from public repositories to analyze using Gramene's comparative genomic and pathway tools. We conducted annual on-site workshops during the Plant and Animal Genome Conference and the Plant Biology meeting to introduce recent updates and new Gramene features to the plant community and provide personalized assistance to our users. We also collaborate with various plant researchers and public databases (e.g. AgBioData Consortium, GrapeIS (49), WheatIS) on development and improvement of data standards.

DISCUSSION AND FUTURE DIRECTIONS

In this report, we described two major improvements made to our website since our last NAR update: Gramene's new homepage with integrated search interface, and improved views and functionalities for Plant Reactome. Our current homepage supports simultaneous search of the contents of all Gramene portals, and produces summarized output with clickable links allowing detailed exploration. The Plant Reactome now utilizes the ‘Fireworks’ pathway visualization platform for pathway organization and navigation and provides users an option to overlay gene–gene interaction data (fetched remotely via APIs from external resources or by uploading their own data) on the pathway diagrams. In the past 2 years, the number of species covered, with corresponding annotations, has increased across all portals of Gramene (Genome Browsers, Plant Reactome, Expression Atlas, outreach and training material, etc.). Gramene will continue to develop value-added genomic and pathway resources, and extend and develop the platform across all data types. In the future, we expect to surpass 75 annotated genomes and corresponding pathway projections in Plant Reactome, and to include expression datasets for these new species. In addition, we will produce sub-sites focused on important crop species with multiple sequenced genomes (i.e. pangenomes), such as maize, sorghum, rice, wheat, and grape, to provide pan-species views and facilitate comparative analysis. We will improve and develop analysis and visualization tools with special emphasis on inter- and intra-specific analysis of small- and large-scale data sets generated from global crops and emerging plant model species. We will continue to improve our integrated search interface with enhancements to the user interface, new visualizations for interactive search result summaries, and integrated gene tree display modes for exploration of functional conservation. 
In addition, we will continue to train current and future plant biologists on how to generate accurate and reliable data sets, as well as how to use, re-use, and analyze genomic data. Gramene is working with plant researchers and other genomics communities to develop standard formats and guidelines for making heterogeneous data sets, including genes and genomes, germplasm, high-throughput gene expression data, phenotypes, metabolomes, proteomes, and annotations, accessible, reusable, and interoperable. We support the efforts of plant biologists to learn how to accurately annotate and format their data sets using community standards and ontology concepts. One of the key challenges is scaling curation activities. To address this challenge, Gramene has an ongoing collaboration with the American Society of Plant Biologists (ASPB) and its journals Plant Physiology and The Plant Cell to improve adoption of standards for data formatting and annotation during manuscript production. Validating and integrating data that are published in journals with data that reside in databases will improve the utility of both. Through our monthly webinar series and on-site workshops, we will continue to assist researchers in taking full advantage of the available genomic resources and seek their suggestions for improving our tools, data models, and resources. Our hope is that our users will apply these skills for construction of novel hypotheses, validation of existing knowledge, and characterization of all aspects of plant genes, including their functions, expression, roles in pathways, phenotypes, and associated functional and structural variations.

48 in total

1. Population genomic and genome-wide association studies of agroclimatic traits in sorghum.

Authors: Geoffrey P Morris; Punna Ramu; Santosh P Deshpande; C Thomas Hash; Trushar Shah; Hari D Upadhyaya; Oscar Riera-Lizarazu; Patrick J Brown; Charlotte B Acharya; Sharon E Mitchell; James Harriman; Jeffrey C Glaubitz; Edward S Buckler; Stephen Kresovich
Journal: Proc Natl Acad Sci U S A Date: 2012-12-24 Impact factor: 11.205

2. A physical, genetic and functional sequence assembly of the barley genome.

Authors: Klaus F X Mayer; Robbie Waugh; John W S Brown; Alan Schulman; Peter Langridge; Matthias Platzer; Geoffrey B Fincher; Gary J Muehlbauer; Kazuhiro Sato; Timothy J Close; Roger P Wise; Nils Stein
Journal: Nature Date: 2012-10-17 Impact factor: 49.962

3. MAKER-P: a tool kit for the rapid creation, management, and quality control of plant genome annotations.

Authors: Michael S Campbell; MeiYee Law; Carson Holt; Joshua C Stein; Gaurav D Moghe; David E Hufnagel; Jikai Lei; Rujira Achawanantakun; Dian Jiao; Carolyn J Lawrence; Doreen Ware; Shin-Han Shiu; Kevin L Childs; Yanni Sun; Ning Jiang; Mark Yandell
Journal: Plant Physiol Date: 2013-12-04 Impact factor: 8.340

4. PICARA, an analytical pipeline providing probabilistic inference about a priori candidates genes underlying genome-wide association QTL in plants.

Authors: Charles Chen; Genevieve DeClerck; Feng Tian; William Spooner; Susan McCouch; Edward Buckler
Journal: PLoS One Date: 2012-11-07 Impact factor: 3.240

5. Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines.

Authors: Susanna Atwell; Yu S Huang; Bjarni J Vilhjálmsson; Glenda Willems; Matthew Horton; Yan Li; Dazhe Meng; Alexander Platt; Aaron M Tarone; Tina T Hu; Rong Jiang; N Wayan Muliyati; Xu Zhang; Muhammad Ali Amer; Ivan Baxter; Benjamin Brachi; Joanne Chory; Caroline Dean; Marilyne Debieu; Juliette de Meaux; Joseph R Ecker; Nathalie Faure; Joel M Kniskern; Jonathan D G Jones; Todd Michael; Adnane Nemri; Fabrice Roux; David E Salt; Chunlao Tang; Marco Todesco; M Brian Traw; Detlef Weigel; Paul Marjoram; Justin O Borevitz; Joy Bergelson; Magnus Nordborg
Journal: Nature Date: 2010-03-24 Impact factor: 49.962

6. Genome-wide patterns of genetic variation in sweet and grain sorghum (Sorghum bicolor).

Authors: Lei-Ying Zheng; Xiao-Sen Guo; Bing He; Lian-Jun Sun; Yao Peng; Shan-Shan Dong; Teng-Fei Liu; Shuye Jiang; Srinivasan Ramachandran; Chun-Ming Liu; Hai-Chun Jing
Journal: Genome Biol Date: 2011-11-21 Impact factor: 13.583

7. Multiple reference genomes and transcriptomes for Arabidopsis thaliana.

Authors: Xiangchao Gan; Oliver Stegle; Jonas Behr; Joshua G Steffen; Philipp Drewe; Katie L Hildebrand; Rune Lyngsoe; Sebastian J Schultheiss; Edward J Osborne; Vipin T Sreedharan; André Kahles; Regina Bohnert; Géraldine Jean; Paul Derwent; Paul Kersey; Eric J Belfield; Nicholas P Harberd; Eric Kemen; Christopher Toomajian; Paula X Kover; Richard M Clark; Gunnar Rätsch; Richard Mott
Journal: Nature Date: 2011-08-28 Impact factor: 49.962

8. Developing integrated crop knowledge networks to advance candidate gene discovery.

Authors: Keywan Hassani-Pak; Martin Castellote; Maria Esch; Matthew Hindle; Artem Lysenko; Jan Taubert; Christopher Rawlings
Journal: Appl Transl Genom Date: 2016-11-02

9. Plant Reactome: a resource for plant pathways and comparative analysis.

Authors: Sushma Naithani; Justin Preece; Peter D'Eustachio; Parul Gupta; Vindhya Amarasinghe; Palitha D Dharmawardhana; Guanming Wu; Antonio Fabregat; Justin L Elser; Joel Weiser; Maria Keays; Alfonso Munoz-Pomer Fuentes; Robert Petryszak; Lincoln D Stein; Doreen Ware; Pankaj Jaiswal
Journal: Nucleic Acids Res Date: 2016-10-30 Impact factor: 16.971

10. The International Nucleotide Sequence Database Collaboration.

Authors: Yasukazu Nakamura; Guy Cochrane; Ilene Karsch-Mizrachi
Journal: Nucleic Acids Res Date: 2012-11-24 Impact factor: 16.971
