Philippe Lamesch, Tanya Z Berardini, Donghui Li, David Swarbreck, Christopher Wilks, Rajkumar Sasidharan, Robert Muller, Kate Dreher, Debbie L Alexander, Margarita Garcia-Hernandez, Athikkattuvalasu S Karthikeyan, Cynthia H Lee, William D Nelson, Larry Ploetz, Shanker Singh, April Wensel, Eva Huala.
Abstract
The Arabidopsis Information Resource (TAIR, http://arabidopsis.org) is a genome database for Arabidopsis thaliana, an important reference organism for many fundamental aspects of biology as well as basic and applied plant biology research. TAIR serves as a central access point for Arabidopsis data, annotates gene function and expression patterns using controlled vocabulary terms, and maintains and updates the A. thaliana genome assembly and annotation. TAIR also provides researchers with an extensive set of visualization and analysis tools. Recent developments include several new genome releases (TAIR8, TAIR9 and TAIR10) in which the A. thaliana assembly was updated, pseudogenes and transposon genes were re-annotated, and new data from proteomics and next-generation transcriptome sequencing were incorporated into gene models and splice variants. Other highlights include progress on functional annotation of the genome and the release of several new tools, including Textpresso for Arabidopsis, which provides the capability to carry out full-text searches on a large body of research literature.
Year: 2011 PMID: 22140109 PMCID: PMC3245047 DOI: 10.1093/nar/gkr1090
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Arabidopsis thaliana Gene Ontology Annotations
| GO aspect | Experimental (%) | All evidence (%) | Unknown (%) | Not annotated (%) |
|---|---|---|---|---|
| Biological process (BP) | 5826 (20) | 15 644 (54) | 9764 (34) | 3367 (12) |
| Molecular function (MF) | 3816 (13) | 16 504 (57) | 8732 (30) | 3539 (12) |
| Cellular component (CC) | 7762 (27) | 15 383 (54) | 7529 (26) | 5863 (20) |
| BP, MF or CC | 10 595 (37) | 22 047 (77) | n/a | 939 (3)<sup>a</sup> |
Number of A. thaliana genes with annotations to the three GO aspects and their percentages relative to the total number of genes excluding pseudogenes and transposable element genes, based on the TAIR10 genome release. ‘Experimental’ category includes genes annotated with evidence codes IDA (inferred from direct assay), IMP (inferred from mutant phenotype), IGI (inferred from genetic interaction), IPI (inferred from physical interaction) and IEP (inferred from expression profile). ‘All evidence’ includes all evidence codes except ND (no biological data available). ‘Unknown’ includes genes annotated to the GO root term within the indicated category using the ND evidence code. ‘Not annotated’ includes genes with no annotation to date within the indicated GO category. Numbers as of 15 September 2011; n/a, not applicable.
<sup>a</sup> Genes with no GO annotation of any kind.
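The percentages in the GO annotation table can be re-derived from the gene counts. The sketch below assumes the denominator is the TAIR10 total gene count minus pseudogenes and transposable element genes (33 602 − 924 − 3903 = 28 775), which is consistent with the caption and with each aspect's three mutually exclusive categories summing to that number. The variable and function names are illustrative only, and simple rounding may differ from a published value by one point (for example, the CC 'all evidence' figure).

```python
# Gene counts taken from the TAIR10 genome statistics table.
TOTAL_GENES = 33602
PSEUDOGENES = 924
TE_GENES = 3903

# Assumed denominator: all genes except pseudogenes and TE genes.
DENOMINATOR = TOTAL_GENES - PSEUDOGENES - TE_GENES

# Per GO aspect: (experimental, all evidence, unknown, not annotated).
GO_COUNTS = {
    "BP": (5826, 15644, 9764, 3367),
    "MF": (3816, 16504, 8732, 3539),
    "CC": (7762, 15383, 7529, 5863),
}


def percent(count, denominator=DENOMINATOR):
    """Return count as a whole-number percentage of the denominator."""
    return round(100 * count / denominator)


for aspect, (experimental, all_evidence, unknown, not_annotated) in GO_COUNTS.items():
    # 'All evidence', 'Unknown' and 'Not annotated' partition the gene set.
    assert all_evidence + unknown + not_annotated == DENOMINATOR
    print(aspect, [percent(c) for c in (experimental, all_evidence, unknown, not_annotated)])
```

The 'Experimental' column is a subset of 'All evidence', so it is excluded from the partition check.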
Figure 1. Overview of TAIR genome releases. (A) Bar graph displaying the number of annotation updates made in each of the five TAIR releases. Colored bars represent four different classes of updates: updated genes (light green), genes with CDS updates (orange), new genes (yellow) and new splice variants (dark green). (B) Table comparing the TAIR genome releases by types of data and prediction tools used, areas of focus and genome sequence updates. The red line separating TAIR8 from TAIR9 indicates that the coordinates of most genes shifted in the TAIR9 release as a consequence of the integration of 341 indels and the normalization of previously identified sequence contaminations to a standard length of 100 bp. A liftover tool is available at ftp://ftp.arabidopsis.org/home/tair/Software/UpdateCoord/ for updating coordinates of objects mapped to TAIR8 or earlier releases.
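To illustrate why integrating indels shifts downstream coordinates, here is a minimal sketch of a coordinate lift. It is not the TAIR liftover tool and makes no claim about how that tool works. The function name `build_lifter`, the toy indel list, and the convention that an indel shifts every coordinate at or after its position are all assumptions made for the example, and coordinates that fall inside a deleted region are not handled.

```python
from bisect import bisect_right
from itertools import accumulate


def build_lifter(indels):
    """Build a function mapping old coordinates to new ones.

    indels: iterable of (old_position, length_change) pairs on one chromosome;
    insertions have a positive length_change, deletions a negative one.
    """
    ordered = sorted(indels)
    positions = [position for position, _ in ordered]
    # Running total of length changes up to and including each indel.
    cumulative = list(accumulate(change for _, change in ordered))

    def lift(old_coord):
        # Number of indels located at or before the old coordinate.
        n_before = bisect_right(positions, old_coord)
        shift = cumulative[n_before - 1] if n_before else 0
        return old_coord + shift

    return lift


# Hypothetical toy indels, not taken from any TAIR release.
lift = build_lifter([(100, 3), (500, -2), (900, 1)])
print(lift(50), lift(499), lift(600), lift(1000))
```

Precomputing the cumulative shifts keeps each lookup to a single binary search, which matters when many features are remapped against the same indel list.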
TAIR10 genome statistics
| Genome | Protein coding | pre-tRNA | rRNA | snRNA | snoRNA | miRNA | Other RNA | Pseudogene | TE gene | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Nuclear | 27 206 | 631 | 4 | 13 | 71 | 177 | 394 | 924 | 3903 | 33 323 |
| Chloroplast | 88 | 37 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 133 |
| Mitochondrial | 122 | 21 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 146 |
| Total | 27 416 | 689 | 15 | 13 | 71 | 177 | 394 | 924 | 3903 | 33 602 |
Number of genes of each category in the TAIR10 genome release.