Kenji Akiyama, Atsushi Kurotani, Kei Iida, Takashi Kuromori, Kazuo Shinozaki, Tetsuya Sakurai.
Abstract
Arabidopsis thaliana is one of the most popular experimental plants. However, only 40% of its genes have at least one experimental Gene Ontology (GO) annotation assigned. Systematic observation of mutant phenotypes is an important technique for elucidating gene functions. Indeed, several large-scale phenotypic analyses have been performed and have generated phenotypic data sets from many Arabidopsis mutant lines and overexpressing lines, which are freely available online. Because each Arabidopsis mutant line database describes phenotypes in its own vocabulary, the differences between these term sets make it difficult to compare data sets and impossible to search across databases. Therefore, we obtained publicly available information for a total of 66,209 Arabidopsis mutant lines, including loss-of-function (RATM and TRAPPER) and gain-of-function (AtFOX and OsFOX) lines, and integrated the phenotype data by mapping the descriptions onto Plant Ontology (PO) and Phenotypic Quality Ontology (PATO) terms. This approach made it possible to manage the four different phenotype databases as one large data set. Here, we report a publicly accessible web-based database, the RIKEN Arabidopsis Genome Encyclopedia II (RARGE II; http://rarge-v2.psc.riken.jp/), in which all of the data described in this study are included. Using the database, we demonstrated consistency (in terms of protein function) with a previous study and identified the presumed function of an unknown gene. As examples, we present AT1G21600, which encodes a subunit of the plastid-encoded RNA polymerase complex, and AT5G56980, which is related to the jasmonic acid signaling pathway.
Keywords: Arabidopsis; Database; Gene function; Mutant line; Ontology; Phenotype
Year: 2013 PMID: 24272250 PMCID: PMC3894705 DOI: 10.1093/pcp/pct165
Source DB: PubMed Journal: Plant Cell Physiol ISSN: 0032-0781 Impact factor: 4.927
Sources of data for the mutant lines
| Source | No. of total lines | URL |
|---|---|---|
| RIKEN Arabidopsis | 17,198 | |
| Cold Spring Harbor Laboratory Arabidopsis gene trap mutant lines | 16,337 | |
| RIKEN Arabidopsis full-length cDNA overexpressed Arabidopsis lines | 14,069 | |
| RIKEN rice full-length cDNA overexpressed Arabidopsis lines | 18,605 | |
| Total | 66,209 | |
Fig. 1 Annotation workflow of loss- and gain-of-function lines. To deduce disrupted and induced genes, we employed independent methods for loss- and gain-of-function lines. For loss-of-function lines, (A) we determined a single transposon insertion point and then (B) deduced the genes whose gene or promoter region contained the insertion. For gain-of-function lines, (C) we deduced the introduced full-length cDNAs and then (D) searched for highly similar gene models. Detailed values at each step are described in Tables 1 and 2.
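Step (B) of the loss-of-function workflow amounts to an interval lookup: given one insertion point, decide whether it falls inside a gene or in the promoter region upstream of it. The following is a minimal sketch of that idea, not the authors' pipeline. The promoter length (`PROMOTER_BP`), the `Gene` record, the locus names and all coordinates are assumptions made for illustration; the text does not state how the promoter region was defined.

```python
from dataclasses import dataclass

# Assumption: the promoter length is not given in the text; 1,000 bp is a placeholder.
PROMOTER_BP = 1000


@dataclass(frozen=True)
class Gene:
    locus: str   # hypothetical locus name
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def classify_insertion(gene, chrom, pos, promoter_bp=PROMOTER_BP):
    """Return "gene", "promoter", or None for one insertion point."""
    if chrom != gene.chrom:
        return None
    if gene.start <= pos <= gene.end:
        return "gene"
    # The promoter lies upstream of the gene, so its side depends on the strand.
    if gene.strand == "+":
        if gene.start - promoter_bp <= pos < gene.start:
            return "promoter"
    else:
        if gene.end < pos <= gene.end + promoter_bp:
            return "promoter"
    return None


def disrupted_genes(genes, chrom, pos):
    """List (locus, region) pairs hit by a single insertion point."""
    hits = []
    for gene in genes:
        region = classify_insertion(gene, chrom, pos)
        if region is not None:
            hits.append((gene.locus, region))
    return hits


# Hypothetical gene models, for illustration only.
genes = [
    Gene("GENE_A", "Chr1", 5000, 7000, "+"),
    Gene("GENE_B", "Chr1", 9000, 9500, "-"),
]

print(disrupted_genes(genes, "Chr1", 6000))
print(disrupted_genes(genes, "Chr1", 9800))
```

A linear scan is enough for a sketch; a genome-scale version would more likely sort the gene models or use an interval tree so that each insertion point is resolved without visiting every gene.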
Disrupted genes in the loss-of-function lines
| | RATM | TRAPPER |
|---|---|---|
| Total lines | 17,198 (139) | 16,337 (1,409) |
| (A) Lines determined to have a single insertion point | 15,690 (139) | 8,540 (884) |
| (B) Lines with an insertion in a gene or promoter region | 13,294 (139) | 7,184 (773) |
The numbers of loss-of-function lines determined to have disrupted genes at each step (see Fig. 1) are shown.
Numbers in parentheses indicate the number of lines observed with any phenotype.
A and B refer to the steps in the workflow shown in Fig. 1.
Overexpressed genes in gain-of-function lines
| | AtFOX | OsFOX |
|---|---|---|
| Total lines | 14,069 (3,429) | 18,605 (5,353) |
| (C) Lines determined to have introduced fl-cDNAs | 8,365 (1,670) | 11,578 (2,940) |
| (D) Lines deduced to have induced genes | 8,357 (1,668) | 10,012 (2,555) |
The numbers of gain-of-function lines determined to have overexpressed genes at each step (see Fig. 1) are shown.
Numbers in parentheses indicate the number of lines observed with any phenotype.
C and D refer to the steps in the workflow shown in Fig. 1.
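The two workflow tables can be read as a filtering funnel: each step keeps only the lines for which a gene could still be assigned. As a small worked example, the sketch below computes the fraction of lines retained after each step, using the total line counts from the tables above (phenotype counts in parentheses are left out). The variable and function names are illustrative.

```python
# Line counts per resource: [total lines, after step A or C, after step B or D].
steps = {
    "RATM": [17198, 15690, 13294],
    "TRAPPER": [16337, 8540, 7184],
    "AtFOX": [14069, 8365, 8357],
    "OsFOX": [18605, 11578, 10012],
}


def retention(counts):
    """Fraction of the starting lines that remain after each step."""
    total = counts[0]
    return [n / total for n in counts]


# The per-resource totals should add up to the 66,209 lines in the source table.
all_lines = sum(counts[0] for counts in steps.values())
print(f"total lines: {all_lines}")

for resource, counts in steps.items():
    fractions = ", ".join(f"{r:.1%}" for r in retention(counts))
    print(f"{resource}: {fractions}")
```

Read this way, roughly 77% of RATM lines but only about 44% of TRAPPER lines end up with a deduced disrupted gene, with most of the TRAPPER loss occurring at the single-insertion step (A).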
Summary of the genes disrupted or induced in loss- and/or gain-of-function lines
| | Loss-of-function: RATM | Loss-of-function: TRAPPER | Gain-of-function: AtFOX | Gain-of-function: OsFOX |
|---|---|---|---|---|
| Genes disrupted or induced per resource | 9,424 (159) | 5,628 (849) | 2,976 (1,176) | 4,177 (1,870) |
| Genes disrupted or induced per mutant type | 12,558 (990), RATM and TRAPPER combined | | 6,382 (2,906), AtFOX and OsFOX combined | |
| Genes disrupted or induced in loss- or gain-of-function lines | 16,015 (3,974), all four resources | | | |
| Genes disrupted and induced in loss- and gain-of-function lines | 2,925 (102), all four resources | | | |
The numbers of genes disrupted in loss-of-function lines and/or overexpressed in gain-of-function lines are shown.
The numbers in parentheses indicate the number of lines showing any of the observable phenotypes.
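The gene counts in the summary table are related by inclusion-exclusion: the genes covered by loss- or gain-of-function lines equal the two per-type counts minus the genes covered by both. The short check below reproduces the reported figure of 16,015. The within-type overlaps it derives are inferences, not values reported in the table, and they assume that each per-mutant-type count is the union of its two per-resource counts.

```python
def union_size(a, b, overlap):
    """Inclusion-exclusion for two sets: |A or B| = |A| + |B| - |A and B|."""
    return a + b - overlap


loss_genes = 12558  # genes disrupted in RATM or TRAPPER lines
gain_genes = 6382   # genes induced in AtFOX or OsFOX lines
both_types = 2925   # genes both disrupted and induced

either_type = union_size(loss_genes, gain_genes, both_types)
print(either_type)  # 16015, matching the summary table

# Inferred (not reported): genes shared by the two resources of each mutant type,
# assuming the per-type count is the union of the per-resource counts.
shared_loss = 9424 + 5628 - loss_genes   # RATM and TRAPPER
shared_gain = 2976 + 4177 - gain_genes   # AtFOX and OsFOX
print(shared_loss, shared_gain)
```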
Fig. 2 Operation workflow of the web-based database. (A) On the top page, users can select the desired resource, fl-cDNAs or mutant lines, or can find mutant lines by phenotype using the ontology tree. Then they can (B) search fl-cDNAs by keywords with filtering from the fl-cDNA search input page, (C) search for mutant lines by keywords with filtering from the mutant line search input page and/or (D) search all mutant lines by selecting phenotype items in the alternative mutant line search input page by phenotype tree. (E) The fl-cDNA search result page shows records that include the fl-cDNA clone name, gene locus name, gene description, the number of mutant lines with disruption or induction of the gene and the mutant lines into which the fl-cDNA was introduced. (F) The mutant line search result page shows records that include the line name, resource name, deduced disrupted or induced gene locus name, gene description and phenotypes observed. (G) The detailed fl-cDNA information page shows the fl-cDNA sequence, its InterPro Scan result, gene models with high degrees of similarity and the names of mutant lines into which the cDNA was introduced. (H) The detailed mutant line information page for the loss-of-function lines shows the visible phenotypes, including the original phenotype description, mapped PO information and PATO information, line photographs and deduced transposon insertion point. (I) The detailed mutant line information page for the gain-of-function lines shows the visible and invisible phenotypes including the original phenotype description, mapped PO information and PATO information, line photographs, the fl-cDNA sequence, its InterPro Scan result and gene models with high degrees of similarity.
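The reason a phenotype search can span all four resources is that every free-text description is stored together with a PO term (which plant structure) and a PATO term (which quality). The sketch below illustrates that record shape and a cross-resource lookup. It is not the RARGE II schema or API: the class, field names, line names and descriptions are hypothetical, and the term strings are placeholders rather than real ontology identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeAnnotation:
    line_name: str    # hypothetical line identifier
    resource: str     # RATM, TRAPPER, AtFOX or OsFOX
    description: str  # original free-text phenotype description
    po_term: str      # Plant Ontology term (placeholder, not a real ID)
    pato_term: str    # PATO term (placeholder, not a real ID)


def search_by_terms(records, po_term=None, pato_term=None):
    """Return records matching the given PO and/or PATO term, in any resource."""
    return [
        r for r in records
        if (po_term is None or r.po_term == po_term)
        and (pato_term is None or r.pato_term == pato_term)
    ]


# Invented records: two resources word the same phenotype differently,
# but both map onto the same pair of ontology terms.
records = [
    PhenotypeAnnotation("line-001", "RATM", "pale green leaves",
                        "PO:EXAMPLE_LEAF", "PATO:EXAMPLE_PALE"),
    PhenotypeAnnotation("line-002", "OsFOX", "leaf colour light",
                        "PO:EXAMPLE_LEAF", "PATO:EXAMPLE_PALE"),
    PhenotypeAnnotation("line-003", "AtFOX", "short roots",
                        "PO:EXAMPLE_ROOT", "PATO:EXAMPLE_SHORT"),
]

hits = search_by_terms(records, po_term="PO:EXAMPLE_LEAF")
resources = sorted({r.resource for r in hits})
print(resources)
```

This sketch matches terms exactly. A search driven by an ontology tree, as in panel (D), would also need to return lines annotated with descendant terms of the selected node.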