Gustavo C Cerqueira, Martha B Arnaud, Diane O Inglis, Marek S Skrzypek, Gail Binkley, Matt Simison, Stuart R Miyasato, Jonathan Binkley, Joshua Orvis, Prachi Shah, Farrell Wymore, Gavin Sherlock, Jennifer R Wortman.
Abstract
The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available web-based resource that was designed for Aspergillus researchers and is also a valuable source of information for the entire fungal research community. In addition to being a repository and central point of access to genome, transcriptome and polymorphism data, AspGD hosts a comprehensive comparative genomics toolbox that facilitates the exploration of precomputed orthologs among the 20 currently available Aspergillus genomes. AspGD curators perform gene product annotation based on review of the literature for four key Aspergillus species: Aspergillus nidulans, Aspergillus oryzae, Aspergillus fumigatus and Aspergillus niger. We have iteratively improved the structural annotation of Aspergillus genomes through the analysis of publicly available transcription data, mostly expressed sequence tags, as described in a previous NAR Database article (Arnaud et al. 2012). In this update, we report substantive structural annotation improvements for the A. nidulans, A. oryzae and A. fumigatus genomes based on recently available RNA-Seq data. Over 26 000 loci were updated across these species; although the updates consist primarily of added or extended untranslated regions (UTRs), the new analysis also enabled over 1000 modifications affecting the coding sequence of genes in each target genome.
Year: 2013 PMID: 24194595 PMCID: PMC3965050 DOI: 10.1093/nar/gkt1029
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Incorporation of genomes into AspGD
Statistics regarding COGs: total number of clusters and number of clusters conserved across all species hosted in AspGD.
Statistics of gene model updates
| Number of genes updated by each modification type | ||||
|---|---|---|---|---|
| Total number of genes in the genome | 10 982 | 12 176 | 10 073 | 10 106 |
| Total number of updated genes | 7729 (70%) | 8390 (69%) | 5183 (51%) | 4854 (48%) |
| Merged genes | 36 (now 18) | 284 (now 138) | 28 (now 14) | 36 (now 18) |
| Altered coding sequence | 1340 (12%) | 1930 (16%) | 1685 (17%) | 1422 (14%) |
| Extended 5′UTR | 7043 (64%) | 7125 (59%) | 4534 (45%) | 4201 (42%) |
| Extended 3′UTR | 7289 (66%) | 6336 (52%) | 3548 (35%) | 3560 (35%) |
| Terminal exons added | 750 (7%) | 1182 (10%) | 1255 (12%) | 951 (9%) |
| Introns added or modified | 904 (8%) | 1188 (10%) | 1133 (11%) | 919 (9%) |
Percentages are relative to the total number of genes in the genome of each species or strain.
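The percentage convention stated in the note above can be checked with a short calculation. The sketch below is illustrative only: the counts are copied from the table, the columns are kept in table order because their species or strain labels do not appear in this text, and the names `total_genes`, `updated` and `percent_of_genome` are hypothetical helpers introduced here, not part of AspGD.

```python
# Counts copied from the table above, in the table's column order.
# The species/strain column labels are not present in this text,
# so columns are identified by position only (an assumption of this sketch).
total_genes = [10982, 12176, 10073, 10106]

updated = {
    "Total number of updated genes": [7729, 8390, 5183, 4854],
    "Altered coding sequence": [1340, 1930, 1685, 1422],
    "Extended 5'UTR": [7043, 7125, 4534, 4201],
    "Extended 3'UTR": [7289, 6336, 3548, 3560],
    "Terminal exons added": [750, 1182, 1255, 951],
    "Introns added or modified": [904, 1188, 1133, 919],
}


def percent_of_genome(counts, totals):
    """Return each count as a whole-number percentage of its column total."""
    return [round(100 * c / t) for c, t in zip(counts, totals)]


# Percentage of each genome affected by each modification type.
percentages = {
    name: percent_of_genome(counts, total_genes)
    for name, counts in updated.items()
}

# Sum of updated genes across all four columns.
total_updated_loci = sum(updated["Total number of updated genes"])

for name, pcts in percentages.items():
    print(name, pcts)
print("Updated loci across all columns:", total_updated_loci)
```

Under these assumptions the recomputed values match the percentages printed in the table, and the four per-column totals of updated genes sum to 26 156, which agrees with the abstract's figure of over 26 000 updated loci.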
Figure 1. Examples of gene structural modifications supported by RNA-Seq data for A. nidulans, A. oryzae and A. fumigatus genomes. (A) Gene models with new exons added based on transcription evidence. (B) Gene models that were merged based on RNA-Seq evidence. Colored horizontal bars represent either gene features or RNA-Seq read alignments as described in the legend. A strand-specific RNA-Seq data set is shown in the example featuring A. nidulans gene AN4239.
Figure 2. Enhancements to the website navigation and integration with the JBrowse and GenomeView genome browsers. (A) New look and feel of the AspGD user interface with updated navigation bar. (B) JBrowse instance depicting genes and RNA-Seq reads aligned to A. oryzae chromosome 7: red and blue rectangles on the bottom track indicate reads aligned to the plus and minus strand, respectively. (C) GenomeView instance showing the genomic context of A. nidulans gene AN11070. The RNA-Seq aligned reads are represented by green (plus strand) and blue (minus strand) horizontal bars in the bottom panel. Pink horizontal bars indicate alignment gaps across intronic regions.