Anuradha Pujar, Naama Menda, Aureliano Bombarely, Jeremy D Edwards, Susan R Strickler, Lukas A Mueller.
Abstract
High-quality manual annotation methods and practices need to be scaled to the increased rate of genomic data production. Curation based on gene families and gene networks is one approach that can significantly increase both curation efficiency and quality. The Sol Genomics Network (SGN; http://solgenomics.net) is a comparative genomics platform, with genetic, genomic and phenotypic information of the Solanaceae family and its closely related species that incorporates a community-based gene and phenotype curation system. In this article, we describe a manual curation system for gene families aimed at facilitating curation, querying and visualization of gene interaction patterns underlying complex biological processes, including an interface for efficiently capturing information from experiments with large data sets reported in the literature. Well-annotated multigene families are useful for further exploration of genome organization and gene evolution across species. As an example, we illustrate the system with the multigene transcription factor families, WRKY and Small Auxin Up-regulated RNA (SAUR), which both play important roles in responding to abiotic stresses in plants. Database URL: http://solgenomics.net/
Year: 2013 PMID: 23681907 PMCID: PMC3655285 DOI: 10.1093/database/bat028
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Map viewer. Chromosome distribution of the 82 members of the tomato WRKY gene family. The locations shown are the physical positions of the corresponding gene models in the ITAG2.3 annotation of the tomato genome.
Figure 2. Curation interface for associating a locus from an existing locus page. (A) Search for a locus name by the organism common name. (B) Select the locus from the result list pop-up menu. (C) Select the relationship type from the pop-up menu. (D) Select the evidence code for the described locus relationship. Adding a reference is optional. Only references associated with the loci involved are presented in the pop-up menu.
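The four steps in Figure 2 amount to assembling one association record: two loci, a relationship type, an evidence code and an optional reference. The following is a minimal sketch of such a record in Python. The class name, field names and the two vocabularies are hypothetical illustrations, not SGN's actual schema or controlled vocabularies.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabularies for illustration only; the real SGN pop-up
# menus are not enumerated in the text.
RELATIONSHIP_TYPES = {"paralog", "ortholog", "interacts with"}
EVIDENCE_CODES = {"co-expressed", "sequence similarity"}


@dataclass(frozen=True)
class LocusAssociation:
    """One curated link between two loci (assumed record layout)."""

    subject_locus: str                # locus page the curator starts from
    object_locus: str                 # locus chosen in steps (A) and (B)
    relationship: str                 # chosen in step (C)
    evidence: str                     # chosen in step (D)
    reference: Optional[str] = None   # adding a reference is optional

    def __post_init__(self) -> None:
        # Reject values outside the (assumed) controlled vocabularies.
        if self.relationship not in RELATIONSHIP_TYPES:
            raise ValueError(f"unknown relationship type: {self.relationship}")
        if self.evidence not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence}")


# Example: a paralog link with no reference attached.
link = LocusAssociation("WRKY39", "WRKY1", "paralog", "sequence similarity")
```

Validating against fixed vocabularies mirrors the pop-up menus in the interface, which restrict curators to predefined choices.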
Figure 3. Curated gene family page. A family of putrescine N-methyltransferase (PMT) homologous genes from multiple Solanaceae organisms. Gene family details are editable, and curators can add members to the family in the same way as they associate loci from the locus page, except that the relationship type is predefined. Gene family members are listed by organism with an evidence code.
Figure 4. Visualization of genes associated with drought tolerance in Solanaceae. A simple example network is shown. The nodes represent genes, and the edges represent relationships between the genes. The blue circular node is WRKY 39, and the green diamond-shaped nodes are genes associated with drought, as annotated in the SGN database. The purple edges connecting WRKY 39 to the other genes represent relationships based on the evidence code ‘Co-expressed’. These genes were curated from an article reporting a transcriptomic study of drought response genes in tomato species. The dark gray edge connects the WRKY 39 transcription factor to WRKY 1 (yellow circular node) in a paralogous relationship. Similarly, all red nodes are WRKY gene family members curated in SGN. The network diagram shows the genes in the SGN database that are currently annotated with the GO term ‘GO: response to water deprivation’.
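The network in Figure 4 can be thought of as an undirected graph whose edges carry a label such as ‘Co-expressed’ or a paralogous relationship. The sketch below shows one way to hold and query such a graph in plain Python. The class, the function names and the drought gene identifiers (`drought_gene_A`, `drought_gene_B`) are hypothetical placeholders, and the code does not reflect how SGN stores or renders its networks.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class GeneNetwork:
    """Undirected gene graph with a label on every edge (illustrative only)."""

    def __init__(self) -> None:
        self._adj: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_relationship(self, gene_a: str, gene_b: str, label: str) -> None:
        # Store the edge in both directions because the graph is undirected.
        self._adj[gene_a].append((gene_b, label))
        self._adj[gene_b].append((gene_a, label))

    def neighbors(self, gene: str, label: Optional[str] = None) -> List[str]:
        # Return the sorted neighbors of a gene, optionally filtered by label.
        edges = self._adj.get(gene, [])
        return sorted(n for n, lab in edges if label is None or lab == label)


def build_example_network() -> GeneNetwork:
    net = GeneNetwork()
    # Paralogous relationship between two WRKY family members (from the caption).
    net.add_relationship("WRKY39", "WRKY1", "paralog")
    # Placeholder drought-associated genes linked by co-expression evidence.
    for gene in ("drought_gene_A", "drought_gene_B"):
        net.add_relationship("WRKY39", gene, "co-expressed")
    return net


net = build_example_network()
co_expressed = net.neighbors("WRKY39", label="co-expressed")
```

Filtering neighbors by edge label corresponds to reading only the purple edges or only the gray edge in the figure.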