Abstract
Despite the widespread use of Arabidopsis (Arabidopsis thaliana) as a model plant, a curated dataset of Arabidopsis genes with mutant phenotypes remains to be established. A preliminary list published nine years ago in Plant Physiology is outdated, and genome-wide phenotype information remains difficult to obtain. We describe here a comprehensive dataset of 2,400 genes with a loss-of-function mutant phenotype in Arabidopsis. Phenotype descriptions were gathered primarily from manual curation of the scientific literature. Genes were placed into prioritized groups (essential, morphological, cellular-biochemical, and conditional) based on the documented phenotypes of putative knockout alleles. Phenotype classes (e.g. vegetative, reproductive, and timing, for the morphological group) and subsets (e.g. flowering time, senescence, circadian rhythms, and miscellaneous, for the timing class) were also established. Gene identities were classified as confirmed (through molecular complementation or multiple alleles) or not confirmed. Relationships between mutant phenotype and protein function, genetic redundancy, protein connectivity, and subcellular protein localization were explored. A complementary dataset of 401 genes that exhibit a mutant phenotype only when disrupted in combination with a putative paralog was also compiled. The importance of these genes in confirming functional redundancy and enhancing the value of single gene datasets is discussed. With further input and curation from the Arabidopsis community, these datasets should help to address a variety of important biological questions, provide a foundation for exploring the relationship between genotype and phenotype in angiosperms, enhance the utility of Arabidopsis as a reference plant, and facilitate comparative studies with model genetic organisms.
Year: 2012 PMID: 22247268 PMCID: PMC3291275 DOI: 10.1104/pp.111.192393
Source DB: PubMed Journal: Plant Physiol ISSN: 0032-0889 Impact factor: 8.340
Figure 1. Classification system for Arabidopsis genes with mutant phenotypes based on a series of unique, prioritized phenotype groups (black headings; complete circles) and classes (circle segments), along with nonexclusive phenotype subsets (abbreviated in rectangles). Phenotype subsets are described in more detail in Supplemental Table S1.
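The prioritized grouping described in the abstract and in Figure 1 can be illustrated with a minimal sketch. It assumes that priority follows the order in which the groups are listed (essential, then morphological, then cellular-biochemical, then conditional) and that a gene with phenotypes in several groups is assigned only to the highest-priority one. The names `GROUP_PRIORITY` and `assign_group` are hypothetical and are not part of the dataset.

```python
# Assumed priority order, highest first: essential, morphological,
# cellular-biochemical, conditional (the order used in the abstract).
GROUP_PRIORITY = ["essential", "morphological", "cellular-biochemical", "conditional"]


def assign_group(observed_groups):
    """Return the single highest-priority group among the observed ones.

    observed_groups: iterable of group names documented for one gene.
    Returns None when no recognized group is present.
    """
    observed = set(observed_groups)
    for group in GROUP_PRIORITY:
        if group in observed:
            return group
    return None


# A gene with both a conditional and a morphological phenotype
# is placed in the morphological group under this assumption.
print(assign_group(["conditional", "morphological"]))
```

Because each gene receives exactly one group under this scheme, group counts can be summed without double counting, whereas the nonexclusive subsets cannot.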
Information presented in the Arabidopsis phenotype dataset
| No. of Dataset Columns | Nature of Information Presented |
| 4 | Locus number; gene name, symbol, aliases |
| 1 | Confirmation status of gene-to-phenotype association |
| 3 | Phenotype group, class, and subset assignments |
| 1 | Brief, curated description of mutant phenotype |
| 1 | Method of gene identification |
| 2 | Reference laboratory and year of publication |
| 3 | Closest BLASTP match within Arabidopsis |
| 2 | Limited protein function information, classification |
| 2 | Mitochondrial and plastid localization information |
Phenotype groups and classes in the Arabidopsis phenotype dataset
| Group | Class | Genes in Dataset (No.) | Genes in Dataset (Percentage) | Gene Identity Confirmed (No.) | Gene Identity Confirmed (Percentage) |
| ESN | Subtotal | 719 | 29.9 | 540 | 75.1 |
| ESN | Gametophyte | 197 | 8.2 | 136 | 69.0 |
| ESN | Embryo/seed | 370 | 15.4 | 281 | 75.9 |
| ESN | Lethal | 152 | 6.3 | 123 | 80.9 |
| MRP | Subtotal | 862 | 35.9 | 775 | 89.9 |
| MRP | Vegetative | 640 | 26.7 | 572 | 89.2 |
| MRP | Reproductive | 152 | 6.3 | 141 | 92.8 |
| MRP | Timing | 70 | 2.9 | 62 | 88.6 |
| CLB | Subtotal | 297 | 12.4 | 261 | 87.9 |
| CLB | Cellular | 124 | 5.2 | 111 | 89.5 |
| CLB | Biochemical | 173 | 7.2 | 150 | 86.7 |
| CND | Subtotal | 522 | 21.8 | 445 | 85.2 |
| CND | Physical | 157 | 6.6 | 126 | 80.3 |
| CND | Chemical | 257 | 10.7 | 229 | 89.1 |
| CND | Biological | 108 | 4.5 | 90 | 83.3 |
| Total | All combined | 2,400 | 100.0 | 2,021 | 84.2 |
ESN, Essential; MRP, morphological; CLB, cellular and biochemical; CND, conditional.
Figure 2. Distribution of phenotype subset assignments for Arabidopsis genes with a loss-of-function mutant phenotype. Subsets are colored according to phenotype class (Fig. 1) and numbered as described in Supplemental Table S1. Most essential genes are assigned to a single phenotype subset. Many other genes have more than one subset assignment. Phenotypes of weak alleles and semidominant features observed in heterozygotes are included.
Figure 3. Historical perspective on the identification of Arabidopsis genes with a loss-of-function mutant phenotype through forward and reverse genetics. The year of publication in some cases refers to the date of inclusion in a public database. Additional details are presented in Supplemental Table S2.
Figure 4. Chromosomal locations of 2,400 phenotype genes of Arabidopsis (black lines) placed on a sequence-based physical map of the genome. This figure was generated using the map visualization tool available through TAIR (www.arabidopsis.org/jsp/ChromosomeMap/tool.jsp).
Figure 5. Distribution of phenotype groups among single-copy Arabidopsis phenotype genes with different protein functions. The total numbers of genes analyzed are noted in parentheses.
Phenotypes of pairs of mutants disrupted in genes encoding protein interactors
| Phenotype Group | Percentage of Total Interactors | Matched Pairs | Expected Matched Pairs | Percentage of Pairs Matched | Expected Percentage of Pairs Matched |
| Essential | 45.7 | 22 | 14.6 | 31.4 | 20.9 |
| Morphological | 42.1 | 18 | 12.4 | 25.7 | 17.7 |
| Cellular and biochemical | 3.6 | 1 | 0.1 | 1.4 | 0.1 |
| Conditional | 8.6 | 3 | 0.5 | 4.3 | 0.7 |
| Total | 100.0 | 44 | 27.6 | 62.9 | 39.4 |
Percentage of Total Interactors: among 140 total interactors from 70 interacting protein pairs encoded by unique genes in the phenotype dataset.
Matched Pairs: paired interactors with the same (matched) group assignment among the 70 pairs.
For each phenotype group, Expected Percentage of Pairs Matched = (Percentage of Total Interactors)²/100, and Expected Matched Pairs = Expected Percentage of Pairs Matched × 70 total pairs/100.
Paired interactors have matched group assignments more often than expected based on the frequency of each phenotype group.
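The expected values in the table can be reproduced from the footnoted formula. The sketch below is only an illustration: it takes the rounded percentages printed in the table as input and assumes that the two interactors in a pair are drawn independently, which is what squaring the group frequency implies. The names `PERCENT_OF_INTERACTORS`, `TOTAL_PAIRS`, and `expected_matches` are hypothetical.

```python
# Percentage of total interactors per phenotype group, copied from the table.
PERCENT_OF_INTERACTORS = {
    "Essential": 45.7,
    "Morphological": 42.1,
    "Cellular and biochemical": 3.6,
    "Conditional": 8.6,
}
TOTAL_PAIRS = 70  # interacting protein pairs in the analysis


def expected_matches(percent, total_pairs=TOTAL_PAIRS):
    """Return (expected % of pairs matched, expected matched pairs) for one group."""
    # Chance that both interactors fall in the same group, as a percentage.
    expected_percent = percent ** 2 / 100
    # Convert that percentage into a number of pairs.
    expected_pairs = expected_percent * total_pairs / 100
    return expected_percent, expected_pairs


expected = {g: expected_matches(p) for g, p in PERCENT_OF_INTERACTORS.items()}
for group, (pct, pairs) in expected.items():
    print(f"{group}: {pct:.1f}% of pairs expected, {pairs:.1f} pairs expected")
```

For the essential group, for example, 45.7² / 100 ≈ 20.9% of pairs, or about 14.6 of the 70 pairs, compared with 22 matched pairs observed.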
Figure 6. Levels of protein sequence redundancy (defined in the text) for Arabidopsis genes assigned to different phenotype groups (left side), all genes in the Arabidopsis phenotype dataset (APD), and the whole Arabidopsis genome (WHG). * For this analysis, genes associated with visible defects in epidermal features (trichomes, stomata, root hairs) were moved from the cellular-biochemical (CLB) group to the morphological (MRP) group. The total numbers of genes evaluated are noted in parentheses.
Features of gene clusters in the multiple mutant dataset
| Genes in Cluster | Cluster Type | Examples | Percentage Complete | Phenotype Group: ESN | Phenotype Group: MRP | Phenotype Group: CLB | Phenotype Group: CND |
| 2 | EXC | 87 | 33 | 35 | 34 | 8 | 10 |
| 2 | ASY | 70 | 39 | 30 | 27 | 6 | 7 |
| 2 | SYM | 31 | 10 | 17 | 9 | 2 | 3 |
| 3 | EXC | 7 | 43 | 0 | 2 | 0 | 5 |
| 3 | ASY | 5 | 20 | 0 | 3 | 0 | 2 |
| 3 | CPX | 26 | 8 | 6 | 13 | 4 | 3 |
| 4+ | EXC | 2 | 100 | 0 | 2 | 0 | 0 |
| 4+ | ASY | 1 | 0 | 0 | 1 | 0 | 0 |
| 4+ | CPX | 19 | 0 | 8 | 8 | 2 | 1 |
EXC, Exclusive, both single mutants have no phenotype; ASY, asymmetric, one single mutant has a phenotype but the multiple mutant is more severe; SYM, symmetric, both single mutants have a phenotype but the multiple mutant is more severe; CPX, complex, phenotype information available for two or more combinations of genes within a cluster.
ESN, Essential; MRP, morphological; CLB, cellular and biochemical; CND, conditional.
In complete clusters, all potential paralogs in Arabidopsis are disrupted.
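The cluster types defined in the footnote above can be expressed as a small decision rule. This is a hedged sketch rather than the curation procedure itself: it assumes the multiple mutant is already known to be more severe than any single mutant, and it extends the two-gene wording ("both", "one") to larger clusters by treating "some but not all single mutants have a phenotype" as asymmetric. The function name `classify_cluster` and its parameters are hypothetical.

```python
def classify_cluster(single_mutant_has_phenotype, documented_combinations=1):
    """Assign a cluster type code (EXC, ASY, SYM, or CPX).

    single_mutant_has_phenotype: list of booleans, one per gene in the cluster,
        True when that gene's single mutant has a documented phenotype.
    documented_combinations: number of gene combinations within the cluster
        that have phenotype information.
    """
    # Complex: phenotype information for two or more combinations of genes.
    if documented_combinations >= 2:
        return "CPX"
    with_phenotype = sum(single_mutant_has_phenotype)
    # Exclusive: no single mutant has a phenotype.
    if with_phenotype == 0:
        return "EXC"
    # Symmetric: every single mutant has a phenotype.
    if with_phenotype == len(single_mutant_has_phenotype):
        return "SYM"
    # Asymmetric (assumed generalization): some, but not all, have a phenotype.
    return "ASY"


print(classify_cluster([False, False]))  # two-gene cluster, neither single mutant affected
```

A two-gene cluster can never be complex under this rule, since it contains only one possible combination, which is consistent with the absence of CPX rows for two-gene clusters in the table.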
Figure 7. Examples of complex clusters of three or more paralogous genes with two or more groupings of genes associated with a multiple mutant phenotype. Genes with a single mutant phenotype are highlighted in yellow. Lines indicate groupings that produce a documented phenotype more severe than that of the corresponding single mutants or multiple mutants with fewer members. Cluster identification numbers are noted in parentheses. Supplemental Table S6 presents additional information on the genes and phenotypes involved.